Fun to play! Connect the article page to the large model.

Recently, I integrated TianliGPT into the article, achieving an automatic article summary (TL;DR) module. TianliGPT provides a very simple and quick embedding method, but it focuses on summary generation and related article recommendation features. If you want to expand its capabilities, there are significant limitations. Therefore, I recently abandoned TianliGPT in favor of using Moonshot AI for article summary generation and additional feature expansion.

Determine Requirements#

First, we want to not only generate summaries for articles but also pose relevant questions about the content of the articles along with corresponding answers. Clicking on a question will display the answer to that question. The effect is similar to the image below:

Article Summary Module Preview

Based on the above requirements, we need the model to return content in a JSON format similar to the one below, which will then be processed by the frontend:

{
    "summary": "Article summary content",
    "qa": [
        {
            "question": "Question 1",
            "answer": "Answer 1"
        },{
            "question": "Question 2",
            "answer": "Answer 2"
        },
        ...
    ]
}

We can then design the following prompt to give to the model:

Design a concise list of questions aimed at extracting professional concepts or inadequately detailed aspects from the article. Provide an article summary and formulate 6 questions regarding specific concepts. The questions should be precise, and generate corresponding answers. Please output your response in the following JSON format:
{
    "summary": "Article summary content",
    "qa": [
        {
            "question": "Question 1",
            "answer": "Answer 1"
        },
        ...
    ]
}
Note: Please place the article summary content in the summary field, the questions in the question field, and the answers in the answer field.

Note

A prompt refers to the information or instructions provided to the model to guide it in generating specific responses or outputs. It determines how the model understands and processes user input. An effective prompt typically includes the following elements: clarity, relevance, conciseness, context, and directive.

Since Kimi and the model provided by Moonshot AI are from the same source, we can use Kimi for testing to predict the results we might obtain when using the Moonshot API to some extent ~~(actually to save money)~~. To verify whether this prompt can achieve the desired effect, I tried using it in a conversation with Kimi, and the results are shown in the image below:

Model Conversation Result

Initiating a Conversation with the Model#

After resolving what to communicate with the model, we need to address how to communicate with the model. The official documentation of Moonshot AI provides implementation methods in both Python and Node.js, while here we will use PHP to implement the corresponding functionality.

The official API provided for us is Chat Completions: https://api.moonshot.cn/v1/chat/completions, and examples of the request headers and request content are as follows:

# Request Headers
{
    "Content-Type": "application/json",
    "Authorization": "Bearer $apiKey"
}

# Request Content
{
    "model": "moonshot-v1-8k",
    "messages": [
        {
            "role": "system",
            "content": "You are Kimi, an AI assistant provided by Moonshot AI, and you are better at conversations in Chinese and English. You will provide users with safe, helpful, and accurate answers. At the same time, you will refuse to answer any questions involving terrorism, racial discrimination, pornography, violence, etc. Moonshot AI is a proper noun and should not be translated into other languages."
        },
        { "role": "user", "content": "Hello, my name is Li Lei, what is 1+1?" }
    ],
    "temperature": 0.3
}

model is the model name; Moonshot-v1 currently has three models: moonshot-v1-8k, moonshot-v1-32k, and moonshot-v1-128k.
The messages array is a list of conversation messages, where the role can be system, user, or assistant: system represents system messages that provide context or guidance for the conversation, usually filled with the prompt; user represents the user's message, i.e., the user's question or input; assistant represents the model's reply.
temperature is the sampling temperature, recommended to be 0.3.

We can construct a MoonshotAPI class to implement this functionality:

class MoonshotAPI {
    private $apiKey;
    private $baseUrl;

    public function __construct($apiKey) {
        $this->apiKey = $apiKey;
        $this->baseUrl = "https://api.moonshot.cn/v1/chat/completions";
    }
    
    /**
     * Send and get API response data
     * @param string $model Model name
     * @param array $messages Message array
     * @param float $temperature Temperature parameter affecting the creativity of the response
     * @return mixed API response data
     */
    public function sendRequest($model, $messages, $temperature) {
        $payload = $this->preparePayload($model, $messages, $temperature);
        $response = $this->executeCurlRequest($payload);
        $responseData = json_decode($response, true);
        
        if (json_last_error() !== JSON_ERROR_NONE) {
            throw new RuntimeException("Invalid response format.");
        }
        
        return $responseData;
    }

    /**
     * Construct request content
     * @param string $model Model
     * @param array $messages List of conversation messages
     * @param float $temperature Sampling temperature
     * @return array Constructed request content
     */
    private function preparePayload($model, $messages, $temperature) {
        return [
            'model' => $model,
            'messages' => $messages,
            'temperature' => $temperature,
            'response_format' => ["type" => "json_object"] # Enable JSON Mode
        ];
    }

    /**
     * Send request
     * @param array $data Request data
     * @return string API response data
     */
    private function executeCurlRequest($data) {
        $curl = curl_init();
        curl_setopt_array($curl, [
            CURLOPT_URL => $this->baseUrl,
            CURLOPT_RETURNTRANSFER => true,
            CURLOPT_POST => true,
            CURLOPT_POSTFIELDS => json_encode($data),
            CURLOPT_TIMEOUT => 60,
            CURLOPT_HTTPHEADER => [
                'Content-Type: application/json',
                'Authorization: Bearer ' . $this->apiKey
            ],
        ]);

        $response = curl_exec($curl);

        if ($response === false) {
            $error = curl_error($curl);
            curl_close($curl);
            throw new RuntimeException($error);
        }

        curl_close($curl);
        return $response;
    }
}

Note

If you directly tell Kimi the large model in the prompt: "Please output content in JSON format," Kimi can understand your request and will generate a JSON document as required, but the generated content usually has some flaws, such as outputting additional text outside the JSON document to explain it.

So here we need to enable JSON Mode when constructing the request content, allowing the model to "output a valid, correctly parsable JSON document as requested," which means adding 'response_format' => ["type" => "json_object"] to the return array of the preparePayload method.

It is evident that among the three parameters accepted by the MoonshotAPI class, only the $messages conversation message list is relatively complex. Therefore, we create a getMessages function to construct the conversation message list array.

/**
 * Build an array of messages
 *
 * @param string $articleText Article content
 * @return array Array containing system and user messages
 */
function getMessages($articleText) {
    return [
        [
            "role" => "system",
            "content" => <<<EOT
Design a concise list of questions aimed at extracting professional concepts or inadequately detailed aspects from the article. Provide an article summary and formulate 6 questions regarding specific concepts. The questions should be precise, and generate corresponding answers. Please output your response in the following JSON format:
{
    "summary": "Article content",
    "qa": [
        {
            "question": "Question 1",
            "answer": "Answer 1"
        },
        ...
    ]
}
Note: Please place the article summary content in the summary field, the questions in the question field, and the answers in the answer field.
EOT
        ],
        [
            "role" => "user",
            "content" => $articleText
        ]
    ];
}

Here we fill in the initially designed prompt into the system message and the article content into the first user message. In the actual API request, the messages array will be arranged in chronological order, usually starting with the system message, followed by the user question, and finally the assistant reply. This structure helps maintain the context and coherence of the conversation.

When we communicate with the model, it will return JSON data like this, where we mainly focus on the choices array.

{
	"id": "chatcmpl-xxxxxx",
	"object": "chat.completion",
	"created": xxxxxxxx,
	"model": "moonshot-v1-8k",
	"choices": [{
		"index": 0,
		"message": {
			"role": "assistant",
			"content": "Here is the model's reply"
		},
		"finish_reason": "stop"
	}],
	"usage": {
		"prompt_tokens": 229,
		"completion_tokens": 64,
		"total_tokens": 293
	}
}

In our conversation mode, the choices array usually contains only one object (which is why we write $result['choices'][0] when retrieving the model's reply and other information), and this object represents the text reply generated by the model. The finish_reason within the object indicates the reason for the completion of the generated text; if the model believes it has provided a complete answer, the value of finish_reason will be stop. Therefore, we can use this to determine whether the content generated by the model is complete. The content within the object is the reply given to us by the model.

Next, we create the Moonshot function to call the MoonshotAPI class:

/**
 * Call the MoonshotAPI class
 *
 * @param string $articleText Article content
 * @param string $model The model being used, default is "moonshot-v1-8k"
 * @return array Returns an array with status code and data
 */
function Moonshot($articleText, $model = "moonshot-v1-8k") {
    $apiKey = 'sk-xxxxxxxx'; # $apiKey is the API Key applied for in the user center
    $moonshotApi = new MoonshotAPI($apiKey);
    $messages = getMessages($articleText);
    $temperature = 0.3;

    try {
        $result = $moonshotApi->sendRequest($model, $messages, $temperature);

        if (isset($result['error'])) {
            throw new RuntimeException("Model returned an error: " . $result['error']['message']);
        }

        # Check if the content generated by the model is complete
        $responseContent = $result['choices'][0]['message']['content'] ?? null;
        if ($responseContent === null || $result['choices'][0]['finish_reason'] !== "stop") {
            throw new RuntimeException("Returned content does not exist or is truncated.");
        }

        # Since we enabled JSON Mode, the model's reply is in standard JSON format
        # So we need to filter out non-standard JSON format replies
        $decodedResponse = json_decode($responseContent, true);
        if (json_last_error() !== JSON_ERROR_NONE) {
            throw new RuntimeException("Invalid response format.");
        }
    
        return $result;
    } catch (Exception $e) {
        return ['stat' => 400, 'message' => $e->getMessage()];
    }
}

Now we have the code as shown below. If there are no unexpected issues, calling it directly will yield the model's reply✌️. The frontend can then render the reply on the page, which will not be elaborated further here.

header('Content-Type: application/json');

class MoonshotAPI {...}

function getMessages(...) {...}

function Moonshot(...) {...}

# Usage Example
try {
    $article = "This is the article content";
    $aiResponse = Moonshot($article);

    # Directly output, or perform further processing on the model's returned result
    echo json_encode($aiResponse, JSON_UNESCAPED_UNICODE);
} catch (Exception $e) {
    echo json_encode(['stat' => 400, 'message' => $e->getMessage()], JSON_UNESCAPED_UNICODE);
}

Appendix: Handling Long Articles#

For some articles, directly calling the above code may result in the following error:

{
	"error": {
		"type": "invalid_request_error",
		"message": "Invalid request: Your request exceeded model token limit: 8192"
	}
}

The reason for this error is that the total number of input and output context tokens exceeds the model's (here we assume we are using the moonshot-v1-8k model) set token limit. We need to choose an appropriate model based on the length of the context, plus the expected output token length. For this situation, the documentation also provides an example code on how to choose the appropriate model based on context length, and we just need to convert the code to PHP and integrate it into our previous code.

/**
 * Estimate the token count for given messages.
 * Use the estimate-token-count API provided in the documentation to estimate the token count of input messages.
 * 
 * @param string $apiKey API key
 * @param array $inputMessages Message array to estimate token count
 * @return int Returns the estimated total token count
 */
function estimateTokenCount($apiKey, $inputMessages) {
    $header = [
        'Authorization: Bearer ' . $apiKey,
    ];
    $data = [
        'model' => 'moonshot-v1-128k',
        'messages' => $inputMessages,
    ];

    $curl = curl_init();
    curl_setopt_array($curl, [
        CURLOPT_URL => 'https://api.moonshot.cn/v1/tokenizers/estimate-token-count',
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_POST => true,
        CURLOPT_POSTFIELDS => json_encode($data),
        CURLOPT_TIMEOUT => 60,
        CURLOPT_HTTPHEADER => [
            'Authorization: Bearer ' . $apiKey
        ],
    ]);
    $response = curl_exec($curl);
    if ($response === false) {
        $error = curl_error($curl);
        curl_close($curl);
        throw new RuntimeException($error);
    }
    curl_close($curl);
    $result = json_decode($response, true);
    return $result['data']['total_tokens'];
}

/**
 * Choose the most suitable model based on the estimated token count.
 * 
 * @param string $apiKey API key
 * @param array $inputMessages Message array
 * @param int $defaultMaxTokens Default maximum token limit
 * @return string Returns the selected model name
 */
function selectModel($apiKey, $inputMessages, $defaultMaxTokens = 1024) {
    $promptTokenCount = estimateTokenCount($apiKey, $inputMessages);
    $totalAllowedTokens = $promptTokenCount + $defaultMaxTokens;

    if ($totalAllowedTokens <= 8 * 1024) {
        return "moonshot-v1-8k";
    } elseif ($totalAllowedTokens <= 32 * 1024) {
        return "moonshot-v1-32k";
    } elseif ($totalAllowedTokens <= 128 * 1024) {
        return "moonshot-v1-128k";
    } else {
        throw new Exception("Tokens exceed maximum limit.");
    }
}

In the Moonshot function, when the model returns an error of type invalid_request_error (i.e., exceeding the model's maximum token limit), we call the selectModel function to choose the most suitable model and then re-engage in the conversation with the appropriate model.

function Moonshot($articleText, $model = "moonshot-v1-8k") {
    ...
        if (isset($result['error'])) {
            if ($result['error']['type'] === "invalid_request_error") {
                $model = selectModel($apiKey, $messages);
                return Moonshot($articleText, $model);
           } else {
               throw new RuntimeException("Model returned an error: " . $result['error']['message']);
           }
        }
    ...
}

This article is updated by Mix Space to xLog. The original link is https://www.vinking.top/posts/codes/developing-auto-summary-module-using-ai