04-02 /spider dir

1 bookspider.py

This is the main spider script, under whatever name you gave the spider

import scrapy


class BookspiderSpider(scrapy.Spider):
    name = "bookspider"
    allowed_domains = ["books.toscrape.com"]
    start_urls = ["https://books.toscrape.com"]

    def parse(self, response):
        pass

allowed_domains is a class attribute that confines the spider to the listed domain(s), so it won't follow links out to other sites
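A rough way to picture what allowed_domains does is a host check on every URL the spider wants to follow. The sketch below is only an illustration of that idea using the standard library; it is not Scrapy's own code, and the helper name is_allowed is made up for these notes. It assumes a listed domain also covers its subdomains.

```python
from urllib.parse import urlparse

# Same value as the spider's allowed_domains attribute
ALLOWED_DOMAINS = ["books.toscrape.com"]


def is_allowed(url, allowed_domains=ALLOWED_DOMAINS):
    """Hypothetical helper: True if the URL's host is an allowed domain or a subdomain of one."""
    host = urlparse(url).hostname or ""
    return any(host == d or host.endswith("." + d) for d in allowed_domains)


print(is_allowed("https://books.toscrape.com/catalogue/page-2.html"))
print(is_allowed("https://example.com/"))
```

The first URL stays on the allowed domain, the second one is off-site and would be dropped.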

parse() is the function that runs when the response is returned

This is the script where we put the CSS selectors we find using the Scrapy shell (covered in the next lesson)