A hand-holding HTML tutorial: 20,000-word notes summary [worth bookmarking] (Part I)


  👻 My last blog post, a twenty-thousand-word guide to the Python crawler library requests, was read by many crawler lovers and friends who want to learn crawling. Afterwards many of them wrote to me: "I can crawl now, but what I get back is the page's source code, and I really can't make sense of it! What should I do?" 👻

  😬 (Wry smile.) To give everyone a better footing for the page parsing libraries they will learn next, I stayed up all night preparing this article (split into two parts!) to explain common front-end knowledge thoroughly. Whatever technology you learn, you start from the ground floor: even the tallest building rises from level ground and rests on a solid foundation. So this article (two parts in total!) comes one step before the page parsing libraries. 😬

  😜 In these two blog posts I walk you through the essential knowledge about HTML (HyperText Markup Language). After a crawler fetches a page, you need to understand the page structure at least a little in order to parse the data! HTML is a markup language. A markup language is made up of a set of markup tags, so learning HTML means learning tags! 😜

Here comes the point! Here comes the point!! 💗💗💗

  I believe many of you have already worked through my first blog post and understand the requests library. The HTML knowledge explained in this article (and the next) will let you analyze the page data you crawl in the future~

Knowledge point supply station:
If you compare a web page to a person, HTML is the skeleton, JavaScript is the muscle, and CSS is the skin. The three combine to form a complete web page.

The first concept to introduce is what a web page is made of:
	① HTML: defines what content is in the web page;
	② CSS: styles and lays out the content;
		(1) Select the content to style: how to find the label you want to style;
		(2) Set the style: which styles can be set on a label.
	③ JavaScript: controls the page dynamically.

Part I: introduction to HTML framework

1. What & how to learn & what tools to use

(1) What is HTML?

  1. HTML: (HyperText Markup Language)
      In the narrow sense, HTML refers to web pages;
      In the broad sense, HTML refers to html, css, js and the various frameworks built on them, covering web pages, mobile web pages, mini programs, official accounts, mobile apps, quick apps, etc.

  2. Details:
      Hypertext: text that goes beyond plain text. Simply put, it can contain not only text but also images, audio, video, Flash, etc;
      Markup: there are many tags in a web page, and different tags have different meanings and functions. Tags are also called labels (this article uses both words for the same thing). HTML contains many kinds of tags. They cannot be made up at will; they must follow the HTML specification (W3C).

  3. Examples of labels:
      ① <body></body> -- paired label (double label)
      ② <br> -- single label

(2) How to learn HTML?

  HTML is a markup language, which is composed of a set of markup tags. Learning HTML is learning tags.

(3) Tools used:

   editor: PyCharm (those who do Python will naturally use PyCharm!)
  browser: Firefox, Chrome (these two are recommended!)

2. Basic structure of HTML

3. Specification of HTML file

  1. An html file starts with <html> and ends with </html>, and all other tags are written between <html> and </html>;
  2. The html tag contains only two child tags: head and body;
  3. The content related to web page settings is written in the head tag;
  4. The contents to be displayed are written in the body tag;

4. Basic template of HTML

If you create a new html file in PyCharm, it will look like this!

<!DOCTYPE html>						<!-- Document type declaration -->
<html lang="en">					<!-- Root tag, the start of the document. lang means language and en means English, so this declares the page language as English -->
<head>								<!-- Page head -->
    <meta charset="UTF-8">			<!-- meta sets the character encoding; UTF-8 is the universal (international) encoding -->
    <title>Title</title>			<!-- Page title -->
</head>
<body>								<!-- Page body, the visible area -->

</body>
</html>

Knowledge point supply station:

  1. The <!DOCTYPE html> declaration goes at the very top of the html file. It defines the document type and tells the browser to parse the document according to the HTML specification.
  2. When writing HTML files in PyCharm, <!-- content to comment out --> is a comment, and the shortcut key is Ctrl + /.
  3. When writing HTML files in PyCharm, you can type just the tag name and press Tab to complete it automatically. For example, type p and press Tab, and it is completed as: <p></p>
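
Since this series is about Python crawlers, here is a small sketch of how the template's structure looks from the Python side. It uses the standard library's html.parser to print each tag indented by its nesting depth. The names TEMPLATE, VOID_TAGS and OutlineParser are my own for illustration, and VOID_TAGS is only a partial list that happens to cover this template.

```python
from html.parser import HTMLParser

# The basic template from this section, as a string.
TEMPLATE = """<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <title>Title</title>
</head>
<body>

</body>
</html>"""

# Single tags that never get an end tag (partial list, an assumption for this sketch).
VOID_TAGS = {"meta", "br", "img", "input", "hr", "link"}


class OutlineParser(HTMLParser):
    """Records every start tag together with its nesting depth."""

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.outline = []  # list of (depth, tag name)

    def handle_starttag(self, tag, attrs):
        self.outline.append((self.depth, tag))
        if tag not in VOID_TAGS:
            self.depth += 1  # paired tags open a new nesting level

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS:
            self.depth -= 1


parser = OutlineParser()
parser.feed(TEMPLATE)
for depth, tag in parser.outline:
    print("    " * depth + tag)
```

The printed outline shows html at the outermost level, head and body one level inside it, and meta and title inside head, which matches the rules in "Specification of HTML file".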

Part II: labels

  1. What is a label?
	A label is a word wrapped in angle brackets, such as <html>. A label name cannot start with a number.
  1. Labels are not case sensitive, but lower case is recommended.
  2. Labels can be nested, but they cannot be cross nested.
  3. Tags are also called elements. For example, inline tags can also be called inline elements.
Incorrect example: <a><b></a></b>
Correct example: <a><b></b></a>
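
To make "nested but not cross nested" concrete, here is a minimal sketch that checks the two examples above with a stack. It relies on the standard library's html.parser; NestingChecker and is_properly_nested are hypothetical names of mine, and the sketch only handles paired labels (a single label such as br would need extra handling).

```python
from html.parser import HTMLParser


class NestingChecker(HTMLParser):
    """Keeps open labels on a stack and records end labels that close the wrong one."""

    def __init__(self):
        super().__init__()
        self.stack = []   # labels that are currently open
        self.errors = []  # (end label seen, label that should have been closed)

    def handle_starttag(self, tag, attrs):
        self.stack.append(tag)

    def handle_endtag(self, tag):
        if self.stack and self.stack[-1] == tag:
            self.stack.pop()  # closes the most recently opened label: fine
        else:
            expected = self.stack[-1] if self.stack else None
            self.errors.append((tag, expected))


def is_properly_nested(html):
    checker = NestingChecker()
    checker.feed(html)
    checker.close()
    # Valid only if nothing was closed out of order and nothing is left open.
    return not checker.errors and not checker.stack


print(is_properly_nested("<a><b></a></b>"))  # cross nested
print(is_properly_nested("<a><b></b></a>"))  # properly nested
```

The rule the stack enforces is simple: the label opened last must be closed first.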

1. Label usage forms and attributes

(1) Usage forms of labels:

  1. Paired label: <a>label body</a>. <a> is the start label (also called the opening label) and </a> is the end label (also called the closing label).
  2. Self-closing label / single label, such as: <meta charset="UTF-8">, <br>, etc.
	There are two ways to write a single label:
			Form 1: write only the opening label, as in <br>
			Form 2: add a / at the end of the opening label, as in <br/>

(2) Label properties:

  • Attributes usually appear as key-value pairs. For example, in <meta charset="UTF-8">, charset is the attribute name and the quoted "UTF-8" is the attribute value;
  • Attributes can only appear in a start tag or a self-closing tag, never in an end tag;
  • Attribute names are conventionally written in lowercase, and attribute values should be wrapped in single or double quotation marks;
  • If the attribute value is exactly the same as the attribute name, you can write just the attribute name, such as readonly (an attribute of the input tag)
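
For crawler readers, the points above are easy to see from Python. The sketch below feeds a few tags to the standard library's html.parser and collects each tag's attributes as a dictionary. AttributeCollector is a made-up name for this illustration, and the markup string is just sample data.

```python
from html.parser import HTMLParser


class AttributeCollector(HTMLParser):
    """Collects (tag name, attribute dictionary) for every start tag."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        # attrs arrives as a list of (name, value) pairs.
        self.found.append((tag, dict(attrs)))


collector = AttributeCollector()
collector.feed('<meta charset="UTF-8"><input type="text" readonly><META CHARSET="UTF-8">')
for tag, attributes in collector.found:
    print(tag, attributes)
```

Two things are worth noticing in the output: an attribute written without a value (readonly) comes back with the value None, and html.parser reports tag and attribute names in lowercase even when the source wrote them in uppercase.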

(3) Block label

Part I knowledge points - Characteristics of block labels:

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <title>Characteristics of block labels</title>
</head>
<body>

	<!--Characteristics of block labels (explained with the paragraph label p):-->
	<!--1.Width and height settings take effect, and the outer and inner margins can be controlled-->
	<!--2.If no width is set, the width always matches the parent label, independent of the content. It is 100% of the parent container-->
	<!--3.A paragraph label occupies a whole line by itself, no matter how little content it contains-->
	<!--4.When multiple block labels are written together, they are arranged from top to bottom by default-->
	<!--5.Block labels can hold inline elements and other block elements (p itself is an exception: it can only hold text and inline elements)-->
	<p style="width:100px; height:100px">This is a paragraph label. I'm a block label</p>

</body>
</html>
Knowledge point supply station: 1. px means pixel and is a unit of length; 2. The width * height of an element is displayed in the Elements panel when you inspect the web page in the browser.

Part II knowledge points - commonly used block labels:

Quick fact: the default font size of a web page is 16px, and 1em equals the current font size, so by default 1em = 16px!
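
The heading sizes quoted in this section all follow from that conversion. As a quick check, here is a small sketch that turns the em values for h1 to h6 into pixels, assuming the 16px default. The names BASE_FONT_SIZE_PX, HEADING_SIZES_EM and em_to_px are mine.

```python
BASE_FONT_SIZE_PX = 16  # default font size assumed throughout this article

# Default heading sizes in em, as listed in this section.
HEADING_SIZES_EM = {"h1": 2, "h2": 1.5, "h3": 1.17, "h4": 1, "h5": 0.83, "h6": 0.67}


def em_to_px(em, base_px=BASE_FONT_SIZE_PX):
    """Converts an em value to pixels relative to the base font size."""
    return round(em * base_px, 2)


for tag, em in HEADING_SIZES_EM.items():
    print(f"{tag}: {em}em = {em_to_px(em)}px")
```

If the page's base font size is changed, only base_px changes and every heading scales with it, which is the whole point of a relative unit like em.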

There are four in total:

  1. Heading labels;
	<!--First: heading labels, h1 to h6 (generally used for the titles in an article). The text is bold by default!-->
	<!-- h1 font size is 2em (32px); h2 is 1.5em (24px); h3 is 1.17em (18.72px); h4 is the default font size, 16px;
		h5 is 0.83em (13.28px); h6 is 0.67em, which converts to 10.72px. However, some browsers enforce a minimum font size (for example 12px), in which case h6 is displayed at that minimum!   -->
	<h1>I am a level-one heading. A web page should have only one h1; the other levels can appear multiple times</h1>
	<h2>I'm a level-two heading</h2>
  2. Paragraph labels;
	<!--Second: paragraph label. Its font size is the same as the level-four heading (p splits an HTML document into paragraphs)-->
	<!-- By default, p has 16px of spacing before and after it, but it does not indent the first line! -->
	<p>I'm a paragraph label</p>
  3. List labels: there are three kinds: ordered list, unordered list and definition list;
	<!--Third: list labels-->
	
		<!--(1)Ordered list-->
		<ol type="A" start="3">        <!--ol is short for ordered list. This line starts an ordered list.
		type has five options - 1: number the items 1, 2, 3, 4... (default);
					  A: number the items A, B, C, D...;
					  a: number the items a, b, c, d...;
					  I: number the items I, II, III, IV, V... (uppercase Roman numerals);
					  i: number the items i, ii, iii, iv, v... (lowercase Roman numerals).
									  start specifies which number to start from (here 3, so the first item is labeled C)-->
		    <li>This is ordered list item 1</li>
		    <li>This is ordered list item 2</li>
		    <li>This is ordered list item 3</li>
		</ol>
		
		<!--(2)Unordered list-->
		<ul type="circle">             <!--ul is short for unordered list. This line starts an unordered list.
		type can be none (no marker), circle (hollow circle), square (solid square) or disc (solid circle, the default). The type attribute on ul is obsolete in HTML5; the CSS property list-style-type is the modern way to set it-->
		    <li>This is unordered list item 1</li>	   <!-- One li represents one list item. -->
		    <li>This is unordered list item 2</li>
		    <li>This is unordered list item 3</li>
		    <li>This is unordered list item 4</li>
		</ul>
		
		<!--(3)Definition list-->
		<dl>       					 <!--This line is the beginning of the definition list-->
		    <dt>Fruits</dt>        	 <!--dt is the term being defined, i.e. a category-->
		    <dd>Grape</dd>             <!--dd is the description of the term above, i.e. an item under that category.-->
		    <dd>Durian</dd>			 <!-- dd is indented, dt is not -->
		
		    <dt>Vegetables</dt>
		    <dd>Cauliflower</dd>
		    <dd>Cabbage</dd>
		</dl>
  4. div tag.
	<!--Fourth: div label: used to divide the page into regions
			         (width and height specify the size of the region; background-color sets its background color)-->
	<!--div is a pure block element. Pure means it has no default style. It should not be overused, because that makes later maintenance harder-->
	<!-- The biggest advantage of div: layout. As a container it carries other labels. Because div has no default style, wrapping labels in a div
					  does not affect how those labels are displayed. -->
	<div style="width:500px;height: 500px;background-color: #66a9fe; "> I am a div </div>
Knowledge point supply station:

   If typing list labels by hand is too tedious, you can use a shortcut (n is the number of labels to generate; type the abbreviation and press Tab to expand it!)

In depth: Emmet syntax, an abbreviation syntax for writing tags quickly!

  1. * multiplies. It is followed by a number, and that many labels are generated!
    Example: li*3

    Result: <li></li> <li></li> <li></li>

  2. $ stands for a number. The number starts from 1 and increases by one each time. It is usually used together with *.
    Example: h$*3

    Result: <h1></h1> <h2></h2> <h3></h3>

  3. {} is used to write the text content of the label.
    Example: p{hello}

    Result: <p>hello</p>

  4. [] is used to write the attribute name and attribute value (if no attribute value is given, the attribute is created with an empty value)
    Example ①: div[title=hello]

    Example ②: div[title]

    Result ①: <div title="hello"></div>

    Result ②: <div title=""></div>

  5. > indicates the next level of labels, forming a parent-child (containing) relationship
    Example: ul>li

    Result: <ul><li></li></ul> (li is the child of ul)

  6. + generates a label at the same level (a sibling) after the current label
    Example: h1+p

    Result: <h1></h1> <p></p>

  7. # followed by a name sets the id. Written on its own, #a generates a div tag whose id is a

    Example: #a

    Result: <div id="a"></div>

  8. Combined use: the operators above can be mixed in a single abbreviation.
    Example: ul>li{item $}*3

    Result: <ul> <li>item 1</li> <li>item 2</li> <li>item 3</li> </ul>
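
To show what *, $ and {} are doing, here is a toy expander written in Python. To be clear, this is not how Emmet is implemented and it covers only a tiny subset (one label name, optional {text}, optional *count); ABBREVIATION and expand are names I made up for this sketch.

```python
import re

# Matches abbreviations of the form  tag{text}*count , where {text} and *count are optional.
ABBREVIATION = re.compile(
    r"^(?P<tag>[a-z][a-z0-9]*)(?:\{(?P<text>[^}]*)\})?(?:\*(?P<count>\d+))?$"
)


def expand(abbreviation):
    """Expands a tiny Emmet-like subset: one label, optional {text}, optional *count."""
    match = ABBREVIATION.match(abbreviation)
    if match is None:
        raise ValueError(f"unsupported abbreviation: {abbreviation!r}")
    tag = match.group("tag")
    text = match.group("text") or ""
    count = int(match.group("count") or 1)
    lines = []
    for number in range(1, count + 1):
        # $ stands for the running number, starting from 1.
        lines.append(f"<{tag}>{text.replace('$', str(number))}</{tag}>")
    return "\n".join(lines)


print(expand("li*3"))
print(expand("li{item $}*2"))
```

The loop is the multiplication, and the replace call is the numbering: each generated copy gets its own value of $. Operators such as >, + and [] would need a real parser, so they are left out here.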

(4) Inline label (also called line-level label)

Part I knowledge points - Characteristics of inline Tags:

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <title>Characteristics of inline tags</title>
</head>
<body>

	<!--Characteristics of inline labels (explained with the text label span)-->
	<!--1.Width and height settings have no effect, and top and bottom margins are ignored (left and right margins still work)-->
	<!--2.The width is the width of the text or picture inside, and cannot be set directly-->
	<!--3.There is no automatic line break. Content wraps to the next line only when the current line is full-->
	<!--4.When multiple inline labels are written together, they are arranged from left to right by default-->
	<!--5.Inline elements can only hold text or other inline elements-->
	<span>I'm a text tag</span>

</body>
</html>

Part II knowledge points - Common inline labels:

There are four types:

  1. Picture label
	<!--First: picture label  img + Tab-->
	<img src="" alt="" width="" height="">   <!--src is the address of the picture. It can be the address of a picture on the web,
											or the address of a local picture (a relative path is recommended. Relative means the location of the picture relative to the HTML document, i.e. the reference point is the html file)!-->
		                      <!-- alt is the text displayed when the picture fails to load-->
							  <!-- width is the width of the picture; height is its height. If neither is specified, the picture is shown at its original size. Note: if only one is specified, the other is scaled proportionally -->
		                      <!-- img is not a block element (block) and not a plain inline element (inline); it behaves like an inline-block element (inline-block)  -->
  2. Bold / Italic labels
	<!--Second: Bold/Italic label   b+ Tab  i+ Tab -->
	<b>I'm bold</b>
	<i>I'm in italics</i>
  3. Hyperlink label
	<!--Third: hyperlink label     target="_self" opens in the current page; target="_blank" opens a new page-->
	<a href="" title="A description of the hyperlink (text displayed when the mouse hovers over it)" target="_self">I'm a hyperlink</a>
						   <!-- href can be a web address or a file path -->
						   <!-- If href="", clicking reloads the current page and returns to the top -->
						   <!-- If href="#", clicking returns to the top, but the page is not reloaded -->
						   <!-- If href="#anchor", clicking jumps to the specified anchor (the anchor is actually an id value)! -->
  4. Text label
	<!--Fourth: text label. It only has a visible effect when used together with css!-->
	<!-- span is a pure inline element. Pure means it has no default style -->
	<!-- The biggest advantage of span: applying styles, mainly to inline elements or pieces of text -->
	<span>I'm a text tag</span>
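
Hyperlinks and pictures are usually the first things a crawler wants out of a page, so here is a minimal sketch that pulls href and src values out of some sample markup with the standard library's html.parser. LinkCollector and SAMPLE are hypothetical names, and the URLs in the sample are placeholders.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects the href of every <a> and the (src, alt) of every <img>."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.images = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "a" and "href" in attributes:
            self.links.append(attributes["href"])
        elif tag == "img":
            self.images.append((attributes.get("src"), attributes.get("alt")))


# Sample markup for illustration only.
SAMPLE = """
<p>
    <a href="https://example.com/" target="_blank">home</a>
    <a href="#top">back to top</a>
    <img src="./images/logo.png" alt="logo" width="100">
</p>
"""

collector = LinkCollector()
collector.feed(SAMPLE)
print(collector.links)
print(collector.images)
```

Note that "#top" is collected like any other href. A real crawler would normally skip such in-page anchors, because they point to a position on the same page rather than to a new one.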
First station of knowledge point supply station:

If the picture label uses the address of a local picture (relative path):
   1. If the picture is in a directory at the same level as the html file, write: directory name/picture name
Or: ./directory name/picture name
   2. If the picture is in the parent directory, write: ../picture name
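
The same rules apply when a crawler finds a relative src in a page and needs the full address. The sketch below uses urljoin from Python's standard library to resolve the two cases above. PAGE_URL and the picture names are made-up examples.

```python
from urllib.parse import urljoin

# Hypothetical address of the html file that contains the img labels.
PAGE_URL = "https://example.com/site/pages/index.html"

# Case 1: the picture sits in a directory next to the html file.
same_level = urljoin(PAGE_URL, "images/cat.png")
same_level_dot = urljoin(PAGE_URL, "./images/cat.png")

# Case 2: the picture sits in the parent directory of the html file.
parent_level = urljoin(PAGE_URL, "../cat.png")

print(same_level)      # https://example.com/site/pages/images/cat.png
print(same_level_dot)  # https://example.com/site/pages/images/cat.png
print(parent_level)    # https://example.com/site/cat.png
```

In other words, ./ means "the directory the html file is in" and ../ means "one directory up from there", exactly as described above.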

Second station of knowledge point supply station:


As for why we need to convert between inline tags and block tags, you will see the point once we start setting css styles.
    For example, block labels cannot sit side by side on one line, but we can convert them into inline labels first!!!

In The End!

From now on, stick with it and make a little progress every day. In the near future, you will thank yourself for your efforts!

  I will keep updating the crawler basics column and the crawler in-practice column (to help with page parsing, some essential front-end knowledge will be added too!). If you have read this article carefully, feel free to like it, bookmark it and leave a comment with your thoughts. You can also follow me to read more crawler articles in the future!

	If there are mistakes or inappropriate wording, please point them out in the comment area. Thank you!
	If you reprint this article, please contact me first to explain your purpose, and credit the source and the blogger's name. Thank you!

 


Tags: Python html crawler

Posted on Sun, 10 Oct 2021 18:39:39 -0400 by bseven