I'm trying to scrape a webpage that sits behind a login form requiring a username and password.
I used the Firefox developer tools to inspect the HTTP requests. After I submit the credentials, there is a POST request whose response headers include a Set-Cookie. A subsequent GET then sends this cookie (instead of the user/pass) to reach the actual page. The cookie has an expiration date.

I assume I have to (1) get that cookie with a curl POST, dumping it into cookie.txt, and (2) then issue a curl GET with the -b cookie.txt option (without needing to pass the user/pass again).
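A minimal sketch of that two-step flow, for illustration only. The URLs and the `username`/`password` field names are hypothetical placeholders, not taken from the real site, and the helper `jar_has_cookie` is my own addition to check whether step 1 actually stored anything. It relies on curl writing the jar in the tab-separated Netscape cookie file format, where the cookie name is the sixth field.

```shell
#!/bin/sh
# All URLs and form field names below are hypothetical placeholders.
LOGIN_URL="https://example.com/Login.aspx"
TARGET_URL="https://example.com/Protected.aspx"
COOKIE_JAR="cookie.txt"

# Step 1: POST the credentials; -c writes any cookies the server sets to the jar.
login() {
    curl -s -o /dev/null -c "$COOKIE_JAR" \
        --data-urlencode "username=$1" \
        --data-urlencode "password=$2" \
        "$LOGIN_URL"
}

# Step 2: GET the protected page; -b sends the cookies stored in the jar.
fetch_page() {
    curl -s -b "$COOKIE_JAR" "$TARGET_URL"
}

# Succeeds if the jar holds a cookie named $1.
# The jar is tab-separated: domain, flag, path, secure, expiry, name, value.
jar_has_cookie() {
    awk -F '\t' -v name="$1" '$6 == name { found = 1 } END { exit !found }' "$COOKIE_JAR"
}

# Example usage (not executed here):
#   login "myuser" "mypass" && jar_has_cookie "ASP.NET_SessionId" && fetch_page > page.html
```

Checking the jar right after step 1 separates "the server never set the cookie" from "the cookie was set but not sent back in step 2".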

I want to fetch that Set-Cookie with curl on the command line (step 1).
I tried to replicate what Firefox did by using Copy as cURL on the POST request. It builds the curl command with the user/pass inside a --data-raw switch and a short ASP.NET_SessionId inside a -H 'Cookie: ...' header.
The problem is that the response does come back with headers, but the cookie is not among them. The downloaded HTML indicates that something went wrong.
When building the --data-raw payload, there are fields like __VIEWSTATE, __VIEWSTATEGENERATOR and __EVENTVALIDATION which seem to change on each GET of the login page. I filled them in with a little scripting, but that didn't fix the issue.
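For reference, the kind of scripting described above could look roughly like the sketch below. It makes several assumptions: the login URL and the `txtUser`/`txtPassword` field names are hypothetical, each hidden input is rendered on one line with `name` before `value`, and the form needs no extra fields such as a submit button value. It fetches the form and posts it back with the same cookie jar, and it passes the tokens through `--data-urlencode` because they are base64-like strings that may contain `+`, `/` and `=`, which `--data-raw` sends unencoded. Whether that is what breaks the request here is only a guess.

```shell
#!/bin/sh
# Hypothetical URL and form field names; adjust to the real login form.
LOGIN_URL="https://example.com/Login.aspx"
COOKIE_JAR="cookie.txt"

# Print the value of the hidden input named $1, reading HTML from stdin.
# Assumes name="..." comes before value="..." on the same line.
hidden_value() {
    sed -n "s/.*name=\"$1\"[^>]*value=\"\([^\"]*\)\".*/\1/p" | head -n 1
}

login_with_tokens() {
    # GET the login form, storing any session cookie the server sets.
    page=$(curl -s -c "$COOKIE_JAR" "$LOGIN_URL") || return 1

    # POST the fresh tokens plus credentials; -D - prints the response headers.
    curl -s -o /dev/null -D - -b "$COOKIE_JAR" -c "$COOKIE_JAR" \
        --data-urlencode "__VIEWSTATE=$(printf '%s\n' "$page" | hidden_value __VIEWSTATE)" \
        --data-urlencode "__VIEWSTATEGENERATOR=$(printf '%s\n' "$page" | hidden_value __VIEWSTATEGENERATOR)" \
        --data-urlencode "__EVENTVALIDATION=$(printf '%s\n' "$page" | hidden_value __EVENTVALIDATION)" \
        --data-urlencode "txtUser=$1" \
        --data-urlencode "txtPassword=$2" \
        "$LOGIN_URL"
}

# Example usage (not executed here):
#   login_with_tokens "myuser" "mypass"
```

Using one jar for both requests keeps the session cookie and the tokens from the same GET together, rather than mixing fresh tokens with a session ID copied from the browser.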

I'm running the curl command copied from Firefox together with -c cookie.txt -D - -v ... It neither prints the cookie in the headers nor writes it to the file.

Zeta.Investigator
  • this looks similar to this question https://stackoverflow.com/a/45755598/1119949 – imbuedHope Aug 30 '21 at 21:18
  • @imbuedHope I've seen that post. That answer is pretty much what I'm doing (which doesn't work). Also, for this website, I used `--data-raw` to include my user/pass and didn't explicitly use `--form`s. – Zeta.Investigator Aug 30 '21 at 21:43
