Page Objects for Web Data Extraction | Umair Ahmed

Wednesday, April 3, 12:00pm - 1:15pm (EDT)

    This talk first explains the concept of page objects, based on Martin Fowler's idea that was originally introduced for automating the testing of web pages. It then introduces the newer idea, developed by Zyte, of using page objects for web scraping and discusses the motivations behind it, namely:

    • Pluggable: You create a simple generic scraper, plug the page object in, and it works.
    • Portable: Page objects must be easy to distribute as a Python package and to adopt in any Scrapy project or other Python project.
    • Reusable: The same page object can be used by many different projects and should be easy to adopt in any existing web scraping project.
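
    The three properties above can be illustrated with a minimal, dependency-free sketch. It uses only the Python standard library and is not web-poet's API: the names BookPage, scrape, and to_item, as well as the URL and HTML, are hypothetical and chosen for illustration only.

```python
from dataclasses import dataclass
from html.parser import HTMLParser


class _TitleParser(HTMLParser):
    """Collects the text of the first <h1> element."""

    def __init__(self):
        super().__init__()
        self._in_h1 = False
        self.title = None

    def handle_starttag(self, tag, attrs):
        if tag == "h1" and self.title is None:
            self._in_h1 = True

    def handle_endtag(self, tag):
        if tag == "h1":
            self._in_h1 = False

    def handle_data(self, data):
        if self._in_h1 and self.title is None:
            self.title = data.strip()


@dataclass
class BookPage:
    """Hypothetical page object: all knowledge about one page type lives here."""

    url: str
    html: str

    def to_item(self) -> dict:
        parser = _TitleParser()
        parser.feed(self.html)
        return {"url": self.url, "title": parser.title}


def scrape(page_cls, url, html):
    """Generic scraper: works with any class that exposes to_item()."""
    return page_cls(url=url, html=html).to_item()


# Assumed sample input; a real scraper would download the HTML instead.
item = scrape(
    BookPage,
    "https://example.com/book/1",
    "<html><body><h1>Dune</h1></body></html>",
)
print(item)
```

    The scrape function never inspects the page itself, so a different page object class can be plugged in without changing it, and BookPage could be shipped in its own package and reused by other projects.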

    Next, the talk introduces web-poet, an open-source Python package developed by Zyte for using page objects in web data extraction. Code snippets illustrate the page object idea, and the talk covers the package's main features and the APIs it offers developers.

    Finally, the talk concludes with examples of using web-poet with Scrapy, the most popular Python framework for web scraping. This part introduces scrapy-poet, Zyte's open-source Python package for applying the page object technique in Scrapy specifically.
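
    One way a framework integration could hand page objects to a callback is to read the callback's type annotations and build the requested objects. The sketch below shows that general idea with the standard library only. It is an assumption about the approach, not scrapy-poet's implementation or API, and ProductPage, call_with_pages, and parse_product are hypothetical names.

```python
import inspect
from dataclasses import dataclass


@dataclass
class ProductPage:
    """Hypothetical page object built from raw HTML."""

    html: str

    def to_item(self) -> dict:
        # Placeholder extraction logic: report the size of the HTML.
        return {"length": len(self.html)}


def call_with_pages(callback, html):
    """Build each parameter annotated with a page object class, then call."""
    kwargs = {}
    for name, param in inspect.signature(callback).parameters.items():
        annotation = param.annotation
        # Treat any class with a to_item() method as a page object.
        if inspect.isclass(annotation) and hasattr(annotation, "to_item"):
            kwargs[name] = annotation(html=html)
    return callback(**kwargs)


def parse_product(page: ProductPage):
    """Callback that asks for a page object through its annotation."""
    return page.to_item()


result = call_with_pages(parse_product, "<h1>Widget</h1>")
print(result)
```

    The callback only declares which page object it needs, so the extraction logic stays in the page object and the callback stays generic.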

    Zoom Link - https://zyte.zoom.us/j/85161132569?pwd=bd5xbyVf3tXnP81wbpH3nz1WU1j5E4.1
    Add to Calendar: 2024/04/03 16:00:00 - 2024/04/03 17:15:00 UTC
    Zyte, media@scrapinghub.com