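I threw this together in a hurry, so its use cases are quite limited; feel free to extend it however you like. I needed to migrate an article, and even forcing a copy didn't work, so I wrote this little thing.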
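import re
import requests
from lxml import etree

post_url = input('Enter the article URL: ')
# Fetch the page at the article URL
res = requests.get(post_url)
xx = res.content.decode('utf-8')
x = etree.HTML(xx)
# You need the XPath of the parent (container) element
# XPath example: //*[@id="article-container"]
# If you don't know how, search online
xpath = input('Enter the XPath (you can find it in the browser console): ')
content = x.xpath(xpath + '//*')
# Non-greedy, so only the attribute itself is stripped, not the rest of the line
ree = re.compile(r'class=".*?"|id=".*?"')
urll = re.compile(r'(?<=(src="))(/).*?(?=("))')
with open('resualt.txt', 'w', encoding='utf-8') as file:
    tep1 = ''
    for i in content:
        tep = etree.tostring(i, encoding='utf-8').decode('utf-8').strip()
        tep = re.sub(ree, '', tep)
        # If an image uses a relative path, replace it with an absolute one.
        # You have to find and change the base address yourself: only the
        # leading part is needed, not what follows, e.g. https://dreamtea.top
        # Test it against your own site.
        # The matched path already starts with "/", so no extra slash is added,
        # and each match is rewritten with its own path.
        tep = re.sub(urll, lambda m: 'https://cdn.con' + m.group(), tep)
        # print(tep)
        # Skip an element already contained in the one written just before it
        if tep != tep1 and tep in tep1:
            # print(tep)
            continue
        file.write(tep)
        tep1 = tep
print('Export complete!')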
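The relative-to-absolute image path step is the part most likely to need tweaking per site, so here is a rough standalone sketch of just that step. It is not what the script above does line for line: it swaps in `urllib.parse.urljoin` from the standard library, and the names `SRC_RELATIVE` and `absolutize_src` as well as the sample HTML are made up for illustration. It assumes the paths to fix begin with a single `/`.

```python
import re
from urllib.parse import urljoin

# Matches a src="..." value that starts with a single "/" (root-relative).
# Values such as "https://..." or "//cdn..." are left alone.
SRC_RELATIVE = re.compile(r'(?<=src=")/(?!/)[^"]*')


def absolutize_src(html, base_url):
    """Rewrite root-relative src paths in html against base_url (hypothetical helper)."""
    # A function replacement gives every match its own path.
    return SRC_RELATIVE.sub(lambda m: urljoin(base_url, m.group()), html)


if __name__ == '__main__':
    sample = '<p><img src="/img/a.png"><img src="https://example.com/b.png"></p>'
    print(absolutize_src(sample, 'https://dreamtea.top'))
```

Passing the article URL itself as `base_url` should also work, since `urljoin` keeps only the scheme and host when the path starts with `/`. Still, test it against your own pages.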
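This could be extended into something more automatic, but I'm lazy. I hope someone with time on their hands will extend it, so that I can borrow (copy) it~~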
