
Using Python Web Scraping to Uncover the World of Short Films on a Movie Site: Case Study 1
Source: Securities Times Online (证券时报网)  Author: 陈腾健  2025-08-15 02:25:25

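As the internet has grown, films, TV series, and other video content have become a major part of everyday life. Among the many movie sites, platforms that specialize in short films and micro-films hold a rich stock of video resources and potential traffic value. Their complicated page structures and anti-scraping mechanisms, however, make it hard to collect those resources quickly and systematically.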

This is where Python web scraping becomes a valuable tool for many hobbyists and content collectors.

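What is a web scraper? Put simply, it is an automated tool that uses code to simulate a browser visiting web pages and extracts the needed information from their source. It saves a great deal of manual searching, makes it possible to build a dataset quickly, and provides the basis for data analysis, content curation, and even further development.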

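Take the short films on a certain movie site as an example. Suppose we want to do two things: first, scrape the information for every short film (title, link, synopsis, duration, play count, and so on); second, save it automatically to a local database or an Excel spreadsheet for later use. Sites of this kind reportedly have fairly complex page structures involving multi-page data, AJAX loading, and anti-scraping measures, so designing an efficient and stable scraper is essential.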

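Before writing any code, the goals must be clear: What information do we want to collect? What is the site's URL pattern? What does the data structure of the page look like? All of these shape the design of the scraping script.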

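First, analyze the page. Using the browser's developer tools, inspect the page source and find the HTML tags or CSS paths that correspond to the target information. For example, the film title might sit in an h2 tag, the synopsis in a p tag, and the image link in the src attribute of an img tag. Also check how the content is loaded: whether the page uses asynchronous loading (AJAX) affects the scraping strategy.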
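As a minimal sketch of this inspection step, the snippet below parses a hand-written HTML fragment with BeautifulSoup and pulls out the three fields just mentioned. The fragment, the `movie-item` class, and the `extract_fields` helper are assumptions made for illustration; a real page needs selectors read off its own markup.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for one entry on a list page.
SAMPLE_HTML = """
<div class="movie-item">
  <h2>Sample Short Film</h2>
  <p>A two-line synopsis.</p>
  <img src="/covers/sample.jpg" alt="cover">
</div>
"""


def extract_fields(html):
    """Return title, synopsis and image URL for each entry in the fragment."""
    soup = BeautifulSoup(html, "html.parser")
    entries = []
    for item in soup.select("div.movie-item"):
        title = item.select_one("h2")
        synopsis = item.select_one("p")
        image = item.select_one("img")
        entries.append({
            "title": title.get_text(strip=True) if title else None,
            "synopsis": synopsis.get_text(strip=True) if synopsis else None,
            "image": image.get("src") if image else None,
        })
    return entries


if __name__ == "__main__":
    print(extract_fields(SAMPLE_HTML))
```

If fields that are visible in the browser are missing from the HTML that a plain HTTP request returns, the page is probably filling them in asynchronously, and the data has to come from the underlying AJAX request or from a browser-automation tool instead.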

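Second, build the request. Use Python's requests library to simulate a browser request and imitate normal browsing behavior (set headers, cookies, and so on) so that the site does not identify the client as a scraper. Keep the request rate reasonable to avoid being banned.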
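One possible way to set this up is a `requests.Session` that carries the same headers and cookies on every call, plus a small wrapper that pauses before each request. The header values, the `build_session` and `polite_get` names, and the delay range are illustrative assumptions rather than values any particular site requires.

```python
import random
import time

import requests

# Assumed header values for illustration; copy real ones from your own browser.
DEFAULT_HEADERS = {
    "User-Agent": "Mozilla/5.0 (X11; Linux x86_64) ExampleBrowser/1.0",
    "Accept-Language": "en-US,en;q=0.9",
}


def build_session(headers=None, cookies=None):
    """Create a Session that sends the same headers and cookies on every request."""
    session = requests.Session()
    session.headers.update(headers or DEFAULT_HEADERS)
    if cookies:
        session.cookies.update(cookies)
    return session


def polite_get(session, url, min_delay=1.0, max_delay=3.0, timeout=10):
    """Sleep for a random interval, then issue a GET through the session."""
    time.sleep(random.uniform(min_delay, max_delay))
    return session.get(url, timeout=timeout)
```

A session also reuses the underlying connection and keeps any cookies the server sets, which is closer to how a browser behaves than a series of independent `requests.get` calls.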

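Third, handle pagination. Film listings are usually spread over several pages; work out how the URL changes from page to page and fetch the pages one by one in a loop. For example, a site's pagination parameter might be ?page=1, incrementing by one for each subsequent page.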
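The pagination loop can be sketched as follows, assuming the page number really is a plain `page` query parameter. `page_urls` and `crawl_until_empty` are hypothetical helper names, and `fetch_items` stands for whatever function downloads and parses a single page.

```python
from urllib.parse import urlencode


def page_urls(base_url, first_page=1, last_page=5, param="page"):
    """Yield one list-page URL per page number, first_page to last_page inclusive."""
    for number in range(first_page, last_page + 1):
        yield f"{base_url}?{urlencode({param: number})}"


def crawl_until_empty(urls, fetch_items):
    """Collect items page by page, stopping at the first page that yields none."""
    collected = []
    for url in urls:
        items = fetch_items(url)
        if not items:
            break  # an empty page usually means we ran past the last one
        collected.extend(items)
    return collected


if __name__ == "__main__":
    for url in page_urls("http://example.com/movies", last_page=3):
        print(url)
```

Stopping at the first empty page avoids having to know the total page count in advance, at the cost of ending early if one page in the middle happens to fail.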

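To deal with anti-scraping mechanisms, common measures include setting randomized request headers, using a pool of proxy IPs, simulating browser behavior (with Selenium or Pyppeteer), and even simulating user actions such as clicking "load more".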
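The first two measures can be sketched with nothing more than the standard library. The user-agent strings and proxy addresses below are made-up placeholders, and `random_request_options` is a hypothetical helper; the dictionary it returns uses the `headers` and `proxies` keyword names that `requests.get` accepts.

```python
import random

# Placeholder values: these user-agent strings and proxy addresses are made up.
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) ExampleBrowser/1.0",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 13_0) ExampleBrowser/2.0",
    "Mozilla/5.0 (X11; Linux x86_64) ExampleBrowser/3.0",
]
PROXY_POOL = [
    "http://10.0.0.1:8080",
    "http://10.0.0.2:8080",
]


def random_request_options(rng=random):
    """Pick a User-Agent and a proxy at random for the next request.

    The result can be passed on as requests.get(url, **options).
    """
    proxy = rng.choice(PROXY_POOL)
    return {
        "headers": {"User-Agent": rng.choice(USER_AGENTS)},
        "proxies": {"http": proxy, "https": proxy},
    }


if __name__ == "__main__":
    print(random_request_options())
```

Passing a seeded `random.Random` instance as `rng` makes the choices reproducible, which helps when debugging a run that was blocked.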

In practice, scraping runs into plenty of problems, such as dead image links, duplicate content, and unexpected changes to the page structure. This calls for durable scraper code with exception handling, resumable crawling, and content deduplication.
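The three mechanisms named here might look like the sketch below. All function names, the `link` key used for deduplication, and the JSON checkpoint layout are assumptions chosen for illustration, not a fixed recipe.

```python
import json
import os
import time


def with_retries(func, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call func(); after a failure wait base_delay * 2**n seconds and try again."""
    for attempt in range(attempts):
        try:
            return func()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)


def dedupe(records, seen, key="link"):
    """Return records whose key has not been seen yet, updating `seen` in place."""
    fresh = []
    for record in records:
        if record[key] not in seen:
            seen.add(record[key])
            fresh.append(record)
    return fresh


def load_checkpoint(path):
    """Return the last finished page number, or 0 if no checkpoint file exists."""
    if not os.path.exists(path):
        return 0
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)["last_page"]


def save_checkpoint(path, page):
    """Record the last finished page so a restarted run can resume after it."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump({"last_page": page}, handle)
```

A crawl loop would start at `load_checkpoint(path) + 1`, wrap each page fetch in `with_retries`, pass the parsed records through `dedupe`, and call `save_checkpoint` after each page succeeds.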

Of course, once the data has been collected, storage matters too. You can write it to Excel or CSV, or manage it in a database (MySQL, MongoDB, and so on). This step affects both the integrity of the data and the efficiency of later analysis.
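As a rough illustration of the two storage routes, the sketch below writes records to a CSV file and to a database. It uses SQLite in place of MySQL or MongoDB purely because the `sqlite3` module ships with Python; the field names, the `movies` table, and both function names are assumptions.

```python
import csv
import sqlite3

# Assumed record layout: one dict per film with these three keys.
FIELDS = ["title", "link", "description"]


def save_csv(records, path):
    """Write the records to a CSV file with a header row."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(records)


def save_sqlite(records, path):
    """Insert the records into a `movies` table, ignoring links already stored."""
    conn = sqlite3.connect(path)
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "CREATE TABLE IF NOT EXISTS movies ("
                "link TEXT PRIMARY KEY, title TEXT, description TEXT)"
            )
            conn.executemany(
                "INSERT OR IGNORE INTO movies (link, title, description) "
                "VALUES (:link, :title, :description)",
                records,
            )
    finally:
        conn.close()
```

Making the link the primary key means a rerun of the scraper does not create duplicate rows, which ties storage back to the deduplication concern.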

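To sum up, scraping the short films on a movie site starts with a careful analysis of the page structure, followed by a sensible request strategy and the right techniques for handling anti-scraping mechanisms, so that collection ends up automated, efficient, and stable. In the next part, I will take you through detailed code examples, starting from scratch and building the complete scraping workflow step by step, so that you know the whole process inside out.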

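The previous part covered the basic principles of web scraping and some techniques for handling anti-scraping measures. This part uses a concrete case to show in detail how to implement a complete scraping workflow in Python. Taking a typical movie site as the subject, we will work through environment setup, program design, data storage, and optimization step by step as a hands-on guide.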

1. Environment setup

Before you start coding, make sure Python is installed (Python 3.8 or later is recommended), along with a few essential third-party libraries such as requests, BeautifulSoup, and pandas, plus Selenium or Pyppeteer if you need them.

    pip install requests beautifulsoup4 pandas selenium openpyxl

2. Page analysis

Open the target page in the browser's developer tools (F12) and identify the following key elements:

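- The URL pattern of the list pages (for example: http://example.com/movies?page=1)
- The structure of each short-film entry (for example, each film sits inside its own container element; the code in section 3 assumes <div class="movie-item">)
- Where each film's details live (in the code in section 3, the title is in <h2> and the synopsis in <p class="desc">)

3. Basic scraping workflow

- Fetch the page source: send the request with requests, imitating a browser's header information.
- Parse the page content: use BeautifulSoup to locate the target tags and extract the useful information.
- Page through the results: change the page-number parameter in the URL to scrape in bulk.
- Store the data: combine the information and write it to Excel or a database.

The sample code is as follows:

    import random
    import time

    import pandas as pd
    import requests
    from bs4 import BeautifulSoup

    headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) ...'}


    def fetch_page(url):
        """Return the page HTML, or None if the request fails."""
        try:
            response = requests.get(url, headers=headers, timeout=10)
            if response.status_code == 200:
                return response.text
            print(f"Request failed, status code: {response.status_code}")
            return None
        except requests.RequestException as e:
            print(f"Request error: {e}")
            return None


    def parse_page(html):
        """Extract title, link and synopsis from each film entry on a list page."""
        soup = BeautifulSoup(html, 'html.parser')
        result = []
        for movie in soup.find_all('div', class_='movie-item'):
            title = movie.find('h2')
            link = movie.find('a')
            desc = movie.find('p', class_='desc')
            if title is None or link is None:
                continue  # skip entries that lack a title or a link
            result.append({
                'title': title.get_text(strip=True),
                'link': link.get('href'),
                'description': desc.get_text(strip=True) if desc else '',
            })
        return result


    max_pages = 10  # maximum number of pages to scrape
    base_url = 'http://example.com/movies?page='
    all_movies = []

    for page in range(1, max_pages + 1):
        url = base_url + str(page)
        print(f"Fetching page {page}: {url}")
        html = fetch_page(url)
        if html:
            all_movies.extend(parse_page(html))
        else:
            print("Failed to fetch the page, skipping.")
        time.sleep(random.uniform(1, 3))  # pause between requests to lower the risk of a ban

    # Save the collected data to Excel (pandas needs openpyxl to write .xlsx files)
    df = pd.DataFrame(all_movies)
    df.to_excel('short_films.xlsx', index=False)
    print("Data saved to short_films.xlsx")

4. Coping with page-structure changes

Page structure is not fixed, so the code must be robust. For example:

- Use try/except to catch exceptions
- Check the page source periodically and adjust the parsing logic promptly
- Use XPath or CSS selectors for more precise targeting

5. Responding to anti-scraping measures

For the anti-scraping measures some sites may use, you can:

- Use proxy IPs to rotate your IP address
- Use Selenium to simulate a browser and load AJAX content
- Control the request rate to avoid hitting the site too often
- Keep request headers consistent so the client looks like a browser

6. Extensions

Beyond basic scraping, you can also:

- Automatically download film preview images and stills
- Scrape with multiple threads or processes for higher throughput
- Use a dedicated framework such as Scrapy to manage large projects
- Build your own database to categorize, tag, and filter the content

7. Summary and outlook

After this hands-on case, you should have a clear picture of the complete Python scraping workflow: analyzing the page, requesting data, parsing content, and storing the results. Going forward, you can combine it with deep learning, image recognition, and similar techniques to mine richer content. Web scraping is useful well beyond video content and is widely applied in news, finance, research, e-commerce, and many other fields. The world of short films is endlessly varied; once you have a grip on scraping, you can get started quickly and explore it. Data will keep flowing, and it rewards bold exploration and flexible use. Let's use Python scraping to set sail on another ocean of information!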

Editor in charge: 阿尔肯·艾比布拉