cyoyo

Life is like a journey: the destination matters less than the scenery along the way and the sunshine in your heart.

Cursor: Have you heard of the powerful code editor driven by GPT-4?

Cursor (https://www.cursor.so/) is a powerful code editor powered by GPT-4 that assists programmers in their daily coding tasks.

The example here is a web scraper. Its target is the Baidu Hot Search list on the Baidu homepage, as shown in the image.

I implemented it using JavaScript.

After pressing Ctrl + K, I entered a prompt asking Cursor to scrape the Baidu Hot Search on https://www.baidu.com/.

Cursor quickly provides the following code:

// Use axios and cheerio libraries to scrape the Baidu Hot Search on the website https://www.baidu.com/

const axios = require("axios");
const cheerio = require("cheerio");

async function fetchBaiduHotSearch() {
  try {
    const response = await axios.get("https://www.baidu.com/");
    const $ = cheerio.load(response.data);
    const hotSearchList = [];

    $(".hotsearch-item").each((index, element) => {
      const title = $(element).find(".title-content-title").text().trim();
      hotSearchList.push(title);
    });

    console.log(hotSearchList);
  } catch (error) {
    console.error("Error fetching Baidu Hot Search:", error);
  }
}

fetchBaiduHotSearch();

I then asked: can this code run?

GPT-4 explained how the code works and told me to install the axios and cheerio libraries with npm first (npm install axios cheerio).

When I ran it, the output was an empty array.

I reported this result to GPT-4, and it suggested checking whether the CSS selectors still match the structure of the webpage.
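One quick way to tell a wrong selector apart from a response that simply lacks the markup is to look for the class name in the raw HTML that came back. The following is a minimal sketch under that assumption. The helper name countClassOccurrences is hypothetical and not part of the original post, and the regular expression is only a rough check, not a real HTML parser.

```javascript
// Hypothetical diagnostic helper (not from the original post).
// Counts how many class attributes in a raw HTML string contain the given class name.
// This is a rough regex-based check, not a substitute for a real HTML parser.
function countClassOccurrences(html, className) {
  // Matches class="..." or class='...' and captures the attribute value.
  const classAttr = /class\s*=\s*(?:"([^"]*)"|'([^']*)')/g;
  let count = 0;
  for (const match of html.matchAll(classAttr)) {
    // Split the attribute value into individual class names.
    const classes = (match[1] || match[2] || "").split(/\s+/);
    if (classes.includes(className)) {
      count += 1;
    }
  }
  return count;
}
```

Calling countClassOccurrences(response.data, "hotsearch-item") right after the request would show whether the fetched page contains any matching elements at all. A count of zero suggests the server returned different HTML than the browser shows, rather than a typo in the selector.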

After confirming that the selectors matched the webpage structure, I reported back to GPT-4. It then suggested setting the User-Agent header on the HTTP request to get past Baidu's anti-scraping measures, since a request that does not look like it comes from a browser may receive a different page.
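The idea of sending browser-like headers can be sketched as a small helper that merges them into an axios-style request config. This is only an illustration: the names BROWSER_HEADERS and withBrowserHeaders are hypothetical, and the header values are the ones used in this post, not values Baidu is known to require.

```javascript
// Hypothetical helper (not part of axios or the original post).
// Header values are taken from this post; they are an assumption, not a requirement.
const BROWSER_HEADERS = {
  "User-Agent":
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.3",
  Referer: "https://www.baidu.com/",
};

// Returns a copy of the given request config with browser-like headers added.
// Headers already set by the caller take precedence over the defaults.
function withBrowserHeaders(config = {}) {
  return {
    ...config,
    headers: { ...BROWSER_HEADERS, ...(config.headers || {}) },
  };
}

// Possible usage, assuming axios is installed:
// const response = await axios.get("https://www.baidu.com/", withBrowserHeaders());
```

Keeping the headers in one place makes it easy to reuse them for other requests or to swap in a different User-Agent string if the current one stops working.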

I applied the changes GPT-4 suggested, ran the code again, and got the correct result.
The final generated code is below:


const axios = require("axios");
const cheerio = require("cheerio");

async function fetchBaiduHotSearch() {
  try {
    // Send browser-like headers; without them the scrape returned an empty array.
    const response = await axios.get("https://www.baidu.com/", {
      headers: {
        "User-Agent":
          "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.3",
        Referer: "https://www.baidu.com/",
      },
    });

    const $ = cheerio.load(response.data);
    const hotSearchList = [];

    $(".hotsearch-item").each((index, element) => {
      const title = $(element).find(".title-content-title").text().trim();
      hotSearchList.push(title);
    });

    console.log(hotSearchList);
  } catch (error) {
    console.error("Error fetching Baidu Hot Search:", error);
  }
}

fetchBaiduHotSearch();