Discussions
<img src="http://www.xxxxxx.jpg" slt="">???
Target site HTML:
<div class="photo">
<img src="http://www.xxxxxx.jpg" slt="" >
I expected to get the JPG link with the code below:
object.img = $html.find('div.photo').find('img').attr('src');
but I can't get it.
How can I get the JPG image link?
Posted by MIKIO FUJITA over 3 years ago
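One thing worth checking for the question above is whether the `<img>` element is present at all in the HTML the crawler fetched, and whether the URL sits in a different attribute. The following is only a minimal sketch, assuming `$html` behaves like a jQuery object (as the snippet in the question suggests). The helper name `getPhotoSrc` and the `data-src` fallback are hypothetical and not taken from the original post.

```javascript
// Hypothetical helper: returns the first image URL inside div.photo, or null.
// Assumes $html is a jQuery-style object, as in the question's snippet.
function getPhotoSrc($html) {
  var $img = $html.find('div.photo img').first();
  if ($img.length === 0) {
    // The element is missing from the fetched HTML (for example, it may be
    // added later by client-side JavaScript).
    return null;
  }
  // Assumption: some pages lazy-load images and keep the real URL in data-src.
  return $img.attr('src') || $img.attr('data-src') || null;
}
```

A `null` result would suggest the selector did not match the fetched markup, which narrows the problem to the page source rather than the `attr('src')` call.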
Where are my files?
I ran the crawl Argenprop-url-and-details_7-restarted 1532659580398, but I cannot find the files it generated. Please answer ASAP. Your service is not working for me.
Posted by Alejandro almost 4 years ago
Could not scrape data from amazon.co.jp
I could not get HTML data from amazon.co.jp when we tried yesterday.
TargetURL: https://www.amazon.co.jp/s?i=hobby&bbn=2189632051&rh=n%3A2277721051%2Cn%3A2277722051%2Cn%3A2189632051%2Cp_n_feature_fifteen_browse-bin%3A3307621051&s=date-desc-rank&page=155&pf_rd_i=2189632051&pf_rd_m=A3P5ROKL5A1OLE&pf_rd_p=cf2542d6-8f93-4f8b-8803-343c480de726&pf_rd_r=6RSZ5NDTY1HYWG4670MK&pf_rd_s=merchandised-search-6&pf_rd_t=101&qid=1563941970&ref=sr_pg_155
The scraping result was as follows:
```
<!DOCTYPE html>
<!--[if lt IE 7]> <html lang="jp" class="a-no-js a-lt-ie9 a-lt-ie8 a-lt-ie7"> <![endif]-->
<!--[if IE 7]> <html lang="jp" class="a-no-js a-lt-ie9 a-lt-ie8"> <![endif]-->
<!--[if IE 8]> <html lang="jp" class="a-no-js a-lt-ie9"> <![endif]-->
<!--[if gt IE 8]><!-->
<html class="a-no-js" lang="jp"><!--<![endif]--><head>
<meta http-equiv="content-type" content="text/html; charset=Shift_JIS">
<meta charset="utf-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge,chrome=1">
<title dir="ltr">Amazon CAPTCHA</title>
<meta name="viewport" content="width=device-width">
<link rel="stylesheet" href="https://images-na.ssl-images-amazon.com/images/G/01/AUIClients/AmazonUI-3c913031596ca78a3768f4e934b1cc02ce238101.secure.min._V1_.css">
<script>
if (true === true) {
var ue_t0 = (+ new Date()),
ue_csm = window,
ue = { t0: ue_t0, d: function() { return (+new Date() - ue_t0); } },
ue_furl = "fls-fe.amazon.co.jp",
ue_mid = "A1VC38T7YXB528",
ue_sid = (document.cookie.match(/session-id=([0-9-]+)/) || [])[1],
ue_sn = "opfcaptcha.amazon.co.jp",
ue_id = 'KKTM8F5RHSCN88RHYEX8';
}
</script>
</head>
<body>
<!--
To discuss automated access to Amazon data please contact [email protected]
For information about migrating to our APIs refer to our Marketplace APIs at https://developer.amazonservices.jp/ref=rm_c_sv, or our Product Advertising API at https://affiliate.amazon.co.jp/gp/advertising/api/detail/main.html/ref=rm_c_ac for advertising use cases.
-->
<!--
Correios.DoNotSend
-->
<div class="a-container a-padding-double-large" style="min-width:350px;padding:44px 0 !important">
<div class="a-row a-spacing-double-large" style="width: 350px; margin: 0 auto">
<div class="a-row a-spacing-medium a-text-center"><i class="a-icon a-logo"></i></div>
<div class="a-box a-alert a-alert-info a-spacing-base">
<div class="a-box-inner">
<i class="a-icon a-icon-alert"></i>
<h4>���ɕ\������Ă��镶������͂��Ă�������</h4>
<p class="a-last">�\����܂��A���q�l�����{�b�g�łȂ����Ƃ��m�F�����Ă��������K�v������܂��B�ŗǂ̂������ŃA�N�Z�X���Ă����������߂ɁA���g���̃u���E�U���N�b�L�[������Ă��邱�Ƃ����m�F���������B</p>
</div>
</div>
<div class="a-section">
<div class="a-box a-color-offset-background">
<div class="a-box-inner a-padding-extra-large">
<form method="get" action="/errors/validateCaptcha" name="">
<input type=hidden name="amzn" value="vMTrEHkdsJiaQr9x5UfAgA==" /><input type=hidden name="amzn-r" value="/s?i=hobby&bbn=2189632051&rh=n%3A2277721051%2Cn%3A2277722051%2Cn%3A2189632051%2Cp_n_feature_fifteen_browse-bin%3A3307621051&s=date-desc-rank&page=2&pf_rd_i=2189632051&pf_rd_m=A3P5ROKL5A1OLE&pf_rd_p=cf2542d6-8f93-4f8b-8803-343c480de726&pf_rd_r=6RSZ5NDTY1HYWG4670MK&pf_rd_s=merchandised-search-6&pf_rd_t=101&qid=1563942549&ref=sr_pg_2" />
<div class="a-row a-spacing-large">
<div class="a-box">
<div class="a-box-inner">
<h4>���̉摜�Ɍ����镶������͂��Ă�������:</h4>
<div class="a-row a-text-center">
<img src="https://images-na.ssl-images-amazon.com/captcha/qujzzelu/Captcha_lewcclnfpa.jpg">
</div>
<div class="a-row a-spacing-base">
<div class="a-row">
<div class="a-column a-span6">
</div>
<div class="a-column a-span7 a-span-last a-text-right">
<a onclick="window.location.reload()">�ʂ̉摜�ɂ��Ă�������</a>
</div>
</div>
<input autocomplete="off" spellcheck="false" placeholder="��������͂��Ă�������" id="captchacharacters" name="field-keywords" class="a-span12" autocapitalize="off" autocorrect="off" type="text">
</div>
</div>
</div>
</div>
<div class="a-section a-spacing-extra-large">
<div class="a-row">
<span class="a-button a-button-primary a-span12">
<span class="a-button-inner">
<button type="submit" class="a-button-text">�V���b�s���O�𑱂���</button>
</span>
</span>
</div>
</div>
</form>
</div>
</div>
</div>
</div>
<div class="a-divider a-divider-section"><div class="a-divider-inner"></div></div>
<div class="a-text-center a-spacing-small a-size-mini">
<a href="https://www.amazon.co.jp/gp/help/customer/display.html/ref=footer_cou/376-1267051-7966065?ie=UTF8&nodeId=643006">���p�K��</a>
<span class="a-letter-space"></span>
<span class="a-letter-space"></span>
<span class="a-letter-space"></span>
<span class="a-letter-space"></span>
<a href="https://www.amazon.co.jp/gp/help/customer/display.html/ref=footer_privacy/376-1267051-7966065?ie=UTF8&nodeId=643000">�v���C�o�V�[�K��</a>
</div>
<div class="a-text-center a-size-mini a-color-secondary">
© 1996-2013, Amazon.com, Inc. or its affiliates
<script>
if (true === true) {
document.write('<img src="https://fls-fe.amaz'+'on.co.jp/'+'1/oc-csi/1/OP/requestId=KKTM8F5RHSCN88RHYEX8&js=1" />');
};
</script>
<noscript>
<img src="https://fls-fe.amazon.co.jp/1/oc-csi/1/OP/requestId=KKTM8F5RHSCN88RHYEX8&js=0" />
</noscript>
</div>
</div>
<script>
if (true === true) {
var elem = document.createElement("script");
elem.src = "https://images-fe.ssl-images-amazon.com/images/G/01/csminstrumentation/csm-captcha-instrumentation.min._V" + (+ new Date()) + "_.js";
document.getElementsByTagName('head')[0].appendChild(elem);
}
</script>
</body></html>
```
Posted by Genki almost 3 years ago
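The output quoted above is a CAPTCHA challenge page rather than the search results: its title is "Amazon CAPTCHA" and its form posts to `/errors/validateCaptcha`. A minimal sketch of how a scraper might flag such responses before trying to parse them, using only those two markers from the quoted HTML; the function name `isCaptchaPage` is hypothetical.

```javascript
// Hypothetical helper: true if the fetched HTML looks like the CAPTCHA page
// quoted in the post above. Both markers come from that quoted output.
function isCaptchaPage(html) {
  if (typeof html !== 'string') {
    return false;
  }
  var hasCaptchaTitle = /<title[^>]*>\s*Amazon CAPTCHA\s*<\/title>/i.test(html);
  var hasCaptchaForm = html.indexOf('/errors/validateCaptcha') !== -1;
  return hasCaptchaTitle || hasCaptchaForm;
}
```

Flagged pages could be skipped or retried later instead of being stored as results.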
Crawls not running
All the crawls I submitted today get queued and then their status changes to STARTED as usual, but the '# of URLs Crawled' does not change. Apparently the crawls aren't actually doing anything.
I tried rerunning crawls from a couple of days ago which ran perfectly fine, but I get the same problem.
Does someone know what might be the problem?
Posted by Freddi Sautter almost 4 years ago
Completed Crawl File Links MISSING
As of this morning, every "Completed" crawl is missing links to the JSON files. Yesterday, they were all there. Now, they're all missing. Please resolve this ASAP.
Posted by Mark Mindlin almost 4 years ago
Inconsistent crawling for same set of data
Hi,
We were scraping a website using our own app and URL list. We are now observing some weird behavior: for the same set of URLs, we get different outputs. It is inconsistent.
Is it because all the 80legs IPs are blacklisted by that particular website?
Kindly reply to the above issue.
Thanks!
Posted by Romil Shah over 1 year ago
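To narrow down an inconsistency like the one described above, it can help to list exactly which URLs returned different output between two runs. A rough sketch, assuming each run has been loaded as an array of `{ url, result }` records; that record shape and the name `findInconsistentUrls` are assumptions, not the documented output format.

```javascript
// Hypothetical helper: returns URLs present in both runs whose results differ.
// Assumed input shape: arrays of { url: string, result: any } records.
function findInconsistentUrls(runA, runB) {
  var byUrl = new Map();
  runA.forEach(function (rec) {
    byUrl.set(rec.url, JSON.stringify(rec.result));
  });
  var differing = [];
  runB.forEach(function (rec) {
    // Note: JSON.stringify is sensitive to object key order.
    if (byUrl.has(rec.url) && byUrl.get(rec.url) !== JSON.stringify(rec.result)) {
      differing.push(rec.url);
    }
  });
  return differing;
}
```

Inspecting the raw output for the URLs this returns may show whether the site served a block or error page on one of the runs.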
HELP !!!
Hi,
I'm trying the product and created a crawl with the following link:
http://www.bing.es/search?q=Agile+Coach+en+Madrid&count=100&first=600
I also set it to extract emails and go 10 levels deep, but nothing happens (it returns 0 and says completed). What am I doing wrong?
You can check all the crawls in my account and you'll see that none of them work.
Posted by erich over 1 year ago
Download Option Not Working for Completed Files
First time using the new interface. My crawl runs fine (EmailCollector.js) - but when I attempt to download the completed file:
a - I no longer see the option to download as a csv file
b - when I click the download link for json - the json displays in the browser - but no file is downloaded.
Thanks
Posted by Fred Harrell almost 4 years ago
Amazon scraping
Is it possible to crawl Amazon and get buy box prices and other info using a list of ASINs?
If possible, how?
Posted by T Nakamura over 1 year ago
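A first step for the question above would be turning the ASIN list into a URL list for the crawl. This is only a sketch: it assumes product pages are reachable at `/dp/<ASIN>` on the chosen Amazon domain, and the function name `asinsToUrls` and the default domain are illustrative. It does not cover extracting buy box prices from the fetched pages.

```javascript
// Hypothetical helper: builds product-page URLs from a list of ASINs.
// Assumption: pages live at https://<domain>/dp/<ASIN>.
function asinsToUrls(asins, domain) {
  var host = domain || 'www.amazon.com';
  var asinPattern = /^[A-Z0-9]{10}$/; // ASINs are 10 alphanumeric characters
  return asins
    .map(function (asin) { return String(asin).trim().toUpperCase(); })
    .filter(function (asin) { return asinPattern.test(asin); }) // drop malformed entries
    .map(function (asin) { return 'https://' + host + '/dp/' + asin; });
}
```

For example, `asinsToUrls(list, 'www.amazon.co.jp')` would target the Japanese site, where `list` is your own array of ASIN strings.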
Crawl limit is lower than my plan
I'm trying to start a new crawl this morning, but for some reason I am being limited to 10,000 URLs instead of the 100,000 URLs per my pricing plan. (Crawls were working normally yesterday.)
Posted by Fred Harrell over 1 year ago