subject

You have been tasked with building a URL file validator for a web crawler. A web crawler is an application that fetches a web page, extracts the URLs present in that page, and then recursively fetches new pages using the extracted URLs. The end goal of a web crawler is to collect text data, images, or other resources present in order to validate resource URLs or hyperlinks on a page. URL validators can be useful to validate if the extracted URL is a valid resource to fetch. In this scenario, you will build a URL validator that checks for supported protocols and file types. What you need to do?
1. Writing detailed comments and docstrings
2. Organizing and structuring code for readability
3. URL = :///
Steps for Completion
Task
Create two lists of strings - one list for Protocol called valid_protocols, and one list for storing File extension called valid_ftleinfo . For this take the protocol list should be restricted to http , https and ftp. The file extension list should be hrl. and docx CSV.
Split an input named url, and then use the first element to see whether the protocol of the URL is in valid_protocols. Similarly, check whether the URL contains a valid file_info.
Task
Write the conditions to return a Boolean value of True if the URL is valid, and False if either the Protocol or the File extension is not valid.
main. py х +
1 def validate_url(url):
2 Validates the given url passed as string.
3
4 Arguments:
5 url --- String, A valid url should be of form :///
6
7 Protocol = [http, https, ftp]
8 Hostname = string
9 Fileinfo = [.html, .csv, .docx]
10 ***
11 # your code starts here.
12
13
14
15 return # return True if url is valid else False
16
17
18 if
19 name _main__': url input("Enter an Url: ")
20 print(validate_url(url))
21
22
23
24
25

ansver
Answers: 3

Another question on Computers and Technology

question
Computers and Technology, 22.06.2019 19:20
Consider the following code snippet: #ifndef cashregister_h#define cashregister_hconst double max_balance = 6000000.0; class cashregister{public: cashregister(); cashregister(double new_balance); void set_balance(double new_balance); double get_balance() const; private: double balance[12]; }; double get_monthly_balance(cashregister bk, int month); #endifwhich of the following is correct? a)the header file is correct as given.b)the definition of max_balance should be removed since header files should not contain constants.c)the definition of cashregister should be removed since header files should not contain class definitions.d)the body of the get_monthly_balance function should be added to the header file.
Answers: 1
question
Computers and Technology, 22.06.2019 20:30
In this lab, you complete a prewritten c program that calculates an employee’s productivity bonus and prints the employee’s name and bonus. bonuses are calculated based on an employee’s productivity score as shown below. a productivity score is calculated by first dividing an employee’s transactions dollar value by the number of transactions and then dividing the result by the number of shifts worked.
Answers: 3
question
Computers and Technology, 22.06.2019 22:00
During physical science class ben and jerry connected three identical lightbulbs in parallel to a battery where happens when ben removes one of the lightbulbs from it’s socket
Answers: 2
question
Computers and Technology, 23.06.2019 01:30
How do you set up a slide show to play continuously, advancing through all the slides without requiring your interaction? a. click set up slide show, and then select the loop continuously until ‘esc' and show without narration options. b. click set up slide show, and then select the loop continuously until ‘esc' and use timings, if present options. c. click set up slide show, and then select the show presenter view and use timings, if present options. d. click set up slide show, and then select the show without animation and browsed at a kiosk (full screen) options.
Answers: 3
You know the right answer?
You have been tasked with building a URL file validator for a web crawler. A web crawler is an appli...
Questions
question
Mathematics, 22.04.2021 16:40
question
Arts, 22.04.2021 16:40
question
Mathematics, 22.04.2021 16:40
question
Mathematics, 22.04.2021 16:40
question
Mathematics, 22.04.2021 16:40
question
Mathematics, 22.04.2021 16:40
Questions on the website: 13722361