Python crawler example tutorial: Douban movie rankings with the requests library

2021/09/05 21:59:07 technology 116

In the previous lessons, we used the requests library for simple web page collection and for Baidu translation. In this lesson we continue with another case from the Python crawler example tutorial: the Douban movie ranking list. It is similar to the case in the previous lesson and also involves the json module, asynchronous loading and partial page loading. We will go through the steps one by one.

1. Main content obtained

We use the Douban movie ranking page (https://movie.douban.com/typerank?type_name=%E5%96%9C%E5%89%A7&type=24&interval_id=100:90&action=) to get information about each movie, such as its link, title and rating (see below).

[Figure: the Douban movie ranking page]
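To see what the ranking page URL actually carries, its query string can be decoded with the standard library. This is only an illustrative sketch: the helper name `page_query` is hypothetical, and the URL is the page address quoted above.

```python
from urllib.parse import urlsplit, parse_qs

# The ranking page address quoted in this tutorial.
PAGE_URL = (
    "https://movie.douban.com/typerank"
    "?type_name=%E5%96%9C%E5%89%A7&type=24&interval_id=100:90&action="
)


def page_query(url):
    """Return the query-string parameters of a URL as a plain dict."""
    query = urlsplit(url).query
    # keep_blank_values=True keeps the empty "action" parameter.
    parsed = parse_qs(query, keep_blank_values=True)
    # parse_qs returns a list per key; each key appears once here,
    # so the first value is enough.
    return {key: values[0] for key, values in parsed.items()}


params = page_query(PAGE_URL)
```

Here `params["type_name"]` decodes to 喜剧 (comedy), `params["type"]` is `"24"`, `params["interval_id"]` is `"100:90"`, and `params["action"]` is empty.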

2. Analyzing the problem

When we open the URL we want to crawl and drag the scroll bar down, more movies keep loading while the URL stays the same. This should remind us of the Baidu case from the previous lesson, which behaved the same way: the page loads its data asynchronously through Ajax. To find the request URL, headers, parameters and other details, open the browser's developer tools and, instead of viewing the requests under "All", select the "XHR" filter (as shown in the figure below).

3. Write the code

The first step is to import the requests module.

The second step is to get information such as url, parameters, headers, etc.

Get the URL, parameters and headers of the Ajax request from the XHR view (as follows)

From the figure above we also know that the request method of the page is GET and the response type is JSON, so the code is as follows:
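A minimal sketch of such a request is given below, under several assumptions that are not confirmed by the text above. The endpoint path `/j/chart/top_list` and the `start` parameter are assumed, and should be checked against the XHR request in your own browser. The `type`, `interval_id` and `action` values are carried over from the page URL. The `User-Agent` string is a placeholder, and the function names are hypothetical.

```python
import requests

# Assumed Ajax endpoint; confirm the real one in the browser's XHR view.
TOP_LIST_URL = "https://movie.douban.com/j/chart/top_list"

# Placeholder header; copy the real User-Agent from your browser's request.
HEADERS = {"User-Agent": "Mozilla/5.0"}


def build_params(start=0, limit=100):
    """Build the query parameters for one batch of movies."""
    return {
        "type": "24",             # category id taken from the page URL
        "interval_id": "100:90",  # rating interval taken from the page URL
        "action": "",
        "start": str(start),      # assumed offset of the first movie
        "limit": str(limit),      # number of movies to return
    }


def fetch_movies(start=0, limit=100, timeout=10):
    """Send the GET request and return the decoded JSON response."""
    response = requests.get(
        TOP_LIST_URL,
        params=build_params(start, limit),
        headers=HEADERS,
        timeout=timeout,
    )
    # Raise an exception on HTTP error codes instead of parsing an error page.
    response.raise_for_status()
    return response.json()
```

Calling `fetch_movies()` performs the network request. Passing the parameters as a dict keeps the URL itself free of the query string and lets requests handle the encoding.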

Note that:

(1) The "limit" parameter (limit=1 in the captured request) has been removed from the URL itself and is passed as a request parameter instead.

(2) The value of "limit" in the parameters is changed to 100, because "limit" is the number of movies returned. We want information about 100 movies rather than just 1; the number can of course be changed as needed.
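Once the JSON response has been decoded, the link, title and rating can be picked out of each record. The sketch below uses a made-up sample instead of a live response. The field names `title`, `score` and `url` are assumptions about the response format, and the two example movies are hypothetical placeholders, not real data.

```python
import json

# Hypothetical sample shaped like the list the endpoint is assumed to return.
SAMPLE_RESPONSE = """
[
  {"rank": 1, "title": "Example Movie A", "score": "9.5",
   "url": "https://movie.douban.com/subject/0000001/"},
  {"rank": 2, "title": "Example Movie B", "score": "9.3",
   "url": "https://movie.douban.com/subject/0000002/"}
]
"""


def extract_movies(records):
    """Keep only the title, rating and link of each movie record."""
    return [
        {
            "title": record.get("title"),
            "rating": record.get("score"),
            "link": record.get("url"),
        }
        for record in records
    ]


movies = extract_movies(json.loads(SAMPLE_RESPONSE))

# ensure_ascii=False keeps Chinese titles readable when the text is saved.
text = json.dumps(movies, ensure_ascii=False, indent=2)
print(text)
```

With a real response, the list returned by `response.json()` would take the place of `json.loads(SAMPLE_RESPONSE)`, and `text` could be written to a file opened with `encoding="utf-8"`.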
