Learning notes on the C programming language -- the UNIX system interface

UNIX system

The UNIX operating system provides its services through a set of system calls. These are in effect functions within the operating system that user programs can call.

File descriptor

In the UNIX operating system, all peripheral devices (including the keyboard and the display) are treated as files in the file system, so all input and output is done by reading or writing files.
This means that a single interface handles all communication between a program and peripheral devices.

UNIX uses file descriptors to refer to files.
A file descriptor is a small non-negative integer.
Descriptors 0, 1 and 2 refer to standard input, standard output and standard error, respectively.

The user of a program can redirect its I/O with < and >:

prog <infile >outfile

Low level I/O - read and write

Input and output are done with the read and write system calls.
For both, the first argument is a file descriptor, the second is a character array in the program where the data goes to or comes from, and the third is the number of bytes to transfer.

int n_read = read(int fd, char *buf, int n);
int n_written = write(int fd, char *buf, int n);

Each call returns the number of bytes actually transferred. On reading, a return value of zero means end of file and -1 means an error; on writing, a return value different from the number requested indicates an error.
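
Because read and write may transfer fewer bytes than requested, a copy loop has to check both return values. The following is a minimal sketch of that idea; copy_fd is a hypothetical helper that is not part of the original notes, and it uses the POSIX header <unistd.h> rather than syscalls.h.

```c
#include <unistd.h>

/* copy_fd: copy everything from descriptor in to descriptor out.
   Returns the total number of bytes copied, or -1 on error.
   (Hypothetical helper, not from the original notes.) */
long copy_fd(int in, int out)
{
	char buf[4096];
	long total = 0;
	ssize_t n;

	while ((n = read(in, buf, sizeof buf)) > 0) {
		ssize_t done = 0;

		while (done < n) { /* write may accept fewer bytes than asked */
			ssize_t w = write(out, buf + done, (size_t)(n - done));
			if (w < 0)
				return -1;
			done += w;
		}
		total += n;
	}
	return (n < 0) ? -1 : total; /* n == 0 means end of file */
}
```

Calling copy_fd(0, 1) would copy standard input to standard output, so it also works with the < and > redirection shown earlier.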

The following version of getchar does unbuffered input by reading standard input one character at a time.

#include "syscalls.h"

/* getchar: unbuffered single character input */
int getchar(void)
{
	char c;

	return (read(0, &c, 1) == 1) ? (unsigned char) c : EOF;
}

The second version of getchar reads input in large chunks and hands the characters out one at a time.

#include "syscalls.h"

/* getchar: simple buffered version */
int getchar(void)
{
	static char buf[BUFSIZ];
	static char *bufp = buf;
	static int n = 0;

	if (n == 0) { /* buffer is empty */
		n = read(0, buf, sizeof buf);
		bufp = buf;
	}
	return (--n >= 0) ? (unsigned char) *bufp++ : EOF;
}

open, creat, close and unlink

Apart from the default standard input, standard output and standard error, files must be opened explicitly before they can be read or written. There are two system calls for this, open and creat.

Difference between open and fopen:
open returns a file descriptor, which is just an int, whereas fopen returns a file pointer. open returns -1 if an error occurs.

#include <fcntl.h>
int fd;
int open(char *name, int flags, int perms);

fd = open(name, flags, perms);

As with fopen, the name argument is a string containing the file name. The second argument, flags, is an int that specifies how the file is to be opened. The main values are:

O_RDONLY		open for reading only
O_WRONLY		open for writing only
O_RDWR			open for both reading and writing

It is an error to open a file that does not exist. The creat system call creates a new file or rewrites an existing one:

int creat(char *name, int perms);

fd = creat(name, perms);

creat returns a file descriptor if it was able to create the file, and -1 if not.
If the file already exists, creat truncates it to zero length, discarding its previous contents; it is not an error to creat a file that already exists.

If the file does not already exist, creat creates it with the permissions specified by the perms argument.
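
On current POSIX systems, creat(name, perms) behaves like open with the flags O_WRONLY | O_CREAT | O_TRUNC. The sketch below relies on that equivalence; save_text and the 0644 permission value are illustrative assumptions, not something taken from the notes.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* save_text: create (or truncate) path and write text into it.
   Returns 0 on success, -1 on failure.
   (Hypothetical helper; 0644 is an assumed permission value.) */
int save_text(const char *path, const char *text)
{
	size_t len = strlen(text);
	ssize_t n;
	int fd;

	/* same effect as creat(path, 0644) */
	fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
	if (fd == -1)
		return -1;
	n = write(fd, text, len);
	if (close(fd) == -1 || n != (ssize_t)len)
		return -1; /* a short write is treated as a failure here */
	return 0;
}
```

Calling save_text twice on the same path leaves only the second text in the file, because the file is truncated each time it is opened.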

The standard library function vprintf is like printf, except that the variable argument list is replaced by a single argument that has been initialized by calling the va_start macro.
Similarly, vfprintf and vsprintf correspond to fprintf and sprintf.

#include <stdio.h>
#include <stdlib.h>
#include <stdarg.h>

/* error: print an error message and terminate */
void error(char *fmt, ...)
{
	va_list args;

	va_start(args, fmt);
	fprintf(stderr, "error: ");
	vfprintf(stderr, fmt, args);
	fprintf(stderr, "\n");
	va_end(args);
	exit(1);
}
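
The same va_list technique works with the other v-functions. As an illustration only, the sketch below builds the message in a caller-supplied buffer with vsnprintf (a C99 function) instead of printing it; format_error is a hypothetical name and does not come from the notes.

```c
#include <stdarg.h>
#include <stdio.h>

/* format_error: like the error function above, but writes the message
   into out instead of stderr and does not terminate the program.
   Returns the length the full message needs (as vsnprintf reports it),
   or -1 on failure. (Hypothetical helper.) */
int format_error(char *out, size_t size, const char *fmt, ...)
{
	va_list args;
	int prefix, body;

	prefix = snprintf(out, size, "error: ");
	if (prefix < 0 || (size_t)prefix >= size)
		return -1; /* no room even for the prefix */
	va_start(args, fmt);
	body = vsnprintf(out + prefix, size - (size_t)prefix, fmt, args);
	va_end(args);
	return (body < 0) ? -1 : prefix + body;
}
```

A call such as format_error(msg, sizeof msg, "can't open %s", name) takes the same arguments as error, with the destination buffer and its size added in front.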

The function unlink(char *name) removes the file name from the file system; it corresponds to the standard library function remove.

Random access - lseek

Input / output is usually sequential: each time read and write are called, the position of read and write is immediately after the position of the previous operation.

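The lseek system call moves to any position in a file without reading or writing any data:

long lseek(int fd, long offset, int origin);

The current position in the file fd is set to offset, measured from the beginning of the file, from the current position, or from the end, according to whether origin is 0, 1 or 2.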

The following function reads any number of bytes from any position in a file. It returns the number of bytes read, or -1 on error.

#include "syscalls.h"

/* get: read n bytes from position pos */
int get(int fd, long pos, char *buf, int n)
{
	if (lseek(fd, pos, 0) >= 0) /* move to position pos */
		return read(fd, buf, n);
	else
		return -1;
}

The standard library function fseek is similar to the system call lseek, except that the first parameter of the former is of type FILE * and returns a non-zero value when an error occurs.

Example -- implementation of fopen and getc functions

#define NULL 0
#define EOF (-1)
#define BUFSIZ 1024
#define OPEN_MAX 20 /* maximum number of files open at once */

typedef struct _iobuf {
	int cnt; /* characters left */
	char *ptr; /* next character position */
	char *base; /* location of buffer */
	int flag; /* mode of file access */
	int fd; /* file descriptor */
} FILE;

extern FILE _iob[OPEN_MAX];

#define stdin (&_iob[0])
#define stdout (&_iob[1])
#define stderr (&_iob[2])

enum _flags {
	_READ = 01, /* file open for reading */
	_WRITE = 02, /* file open for writing */
	_UNBUF = 04, /* file is unbuffered */
	_EOF = 010, /* EOF has occurred on this file */
	_ERR = 020 /* error occurred on this file */
};

int _fillbuf(FILE *);
int _flushbuf(int, FILE *);

#define feof(p) (((p)->flag & _EOF) != 0)
#define ferror(p) (((p)->flag & _ERR) != 0)
#define fileno(p) ((p)->fd)
#define getc(p) (--(p)->cnt >= 0 \
			? (unsigned char) *(p)->ptr++ : _fillbuf(p))
#define putc(x,p) (--(p)->cnt >= 0 \
			? *(p)->ptr++ = (x) : _flushbuf((x),p))
#define getchar() getc(stdin)
#define putchar(x) putc((x), stdout)

The main job of fopen is to open the file, position it at the right place, and set the flag bits to indicate the proper state.
fopen does not allocate any buffer space; that is done by _fillbuf when the file is first read.

#include <fcntl.h>
#include "syscalls.h"
#define PERMS 0666 /* read and write for owner, group and others */

/* fopen: open a file and return a file pointer */
FILE *fopen(char *name, char *mode)
{
	int fd;
	FILE *fp;

	if (*mode != 'r' && *mode != 'w' && *mode != 'a')
		return NULL;
	for (fp = _iob; fp < _iob + OPEN_MAX; fp++)
		if ((fp->flag & (_READ | _WRITE)) == 0)
			break; /* found a free slot */
	if (fp >= _iob + OPEN_MAX) /* no free slots */
		return NULL;

	if (*mode == 'w')
		fd = creat(name, PERMS);
	else if (*mode == 'a') {
		if ((fd = open(name, O_WRONLY, 0)) == -1)
			fd = creat(name, PERMS);
		lseek(fd, 0L, 2); /* move to the end of the file */
	} else
		fd = open(name, O_RDONLY, 0);
	if (fd == -1) /* could not access name */
		return NULL;
	fp->fd = fd;
	fp->cnt = 0;
	fp->base = NULL;
	fp->flag = (*mode == 'r') ? _READ : _WRITE;
	return fp;
}

For a particular file, the count is zero the first time getc is called, which forces a call to _fillbuf.
If _fillbuf finds that the file is not open for reading, it returns EOF immediately; otherwise it tries to allocate a buffer (if reading is to be buffered).

Once the buffer is established, _fillbuf calls read to fill it, sets the count and pointers, and returns the character at the beginning of the buffer. Subsequent calls to _fillbuf will find a buffer already allocated.

#include <stdlib.h> /* for malloc */
#include "syscalls.h"

/* _fillbuf: allocate and fill input buffer */
int _fillbuf(FILE *fp)
{
	int bufsize;

	if ((fp->flag & (_READ | _EOF | _ERR)) != _READ)
		return EOF;
	bufsize = (fp->flag & _UNBUF) ? 1 : BUFSIZ;
	if (fp->base == NULL) /* no buffer yet */
		if ((fp->base = (char *) malloc(bufsize)) == NULL)
			return EOF; /* cannot get a buffer */
	fp->ptr = fp->base;
	fp->cnt = read(fp->fd, fp->ptr, bufsize);
	if (--fp->cnt < 0) {
		if (fp->cnt == -1) /* read returned 0: end of file */
			fp->flag |= _EOF;
		else /* read returned -1: error */
			fp->flag |= _ERR;
		fp->cnt = 0;
		return EOF;
	}
	return (unsigned char) *fp->ptr++;
}
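
The test at the top of _fillbuf masks three bits at once: the file must be open for reading and must not have hit end of file or an error. The sketch below isolates that check so it can be tried on its own. The flag names are renamed without the leading underscore (an assumption made here so the example does not use reserved identifiers), and readable and buffer_size are hypothetical helpers.

```c
/* Flag bits with the same values as the _flags enumeration in the notes,
   renamed for this standalone example. */
enum flags {
	F_READ = 01,  /* file open for reading */
	F_WRITE = 02, /* file open for writing */
	F_UNBUF = 04, /* file is unbuffered */
	F_EOF = 010,  /* EOF has occurred */
	F_ERR = 020   /* an error has occurred */
};

/* readable: 1 if flag says "open for reading, no EOF, no error", else 0 */
int readable(int flag)
{
	return (flag & (F_READ | F_EOF | F_ERR)) == F_READ;
}

/* buffer_size: 1 byte for unbuffered files, bufsiz otherwise */
int buffer_size(int flag, int bufsiz)
{
	return (flag & F_UNBUF) ? 1 : bufsiz;
}
```

Bits outside the mask, such as F_UNBUF, do not affect the result of readable, which is why the comparison is against F_READ alone.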

The only remaining piece is how everything gets started. The array _iob must be defined and initialized for stdin, stdout and stderr:

FILE _iob[OPEN_MAX] = { /* stdin, stdout, stderr */
	{ 0, (char *) 0, (char *) 0, _READ, 0 },
	{ 0, (char *) 0, (char *) 0, _WRITE, 1 },
	{ 0, (char *) 0, (char *) 0, _WRITE | _UNBUF, 2 }
};

The initial values of the flag members indicate that stdin is to be read, stdout is to be written, and stderr is to be written unbuffered.

Example -- listing directories

A directory is a file that contains a list of file names and an indication of where each file is located.
The location is an index into a table called the inode list.
A directory entry generally consists of only two items: the file name and an inode number.
The inode of a file holds all information about the file except its name.

To isolate the non-portable parts, the task is divided into two layers.
The outer layer defines a structure called Dirent and three functions, opendir, readdir and closedir, which provide system-independent access to the name and inode number in a directory entry.

The Dirent structure contains the inode number and the file name. The maximum length of a file name component is NAME_MAX, whose value is system-dependent.
opendir returns a pointer to a structure called DIR, analogous to FILE, which is used by readdir and closedir.
All of this is collected in the header file dirent.h.

// dirent.h
#define NAME_MAX 14 /* longest file name component; system-dependent */

typedef struct { /* portable directory entry */
	long ino; /* inode number */
	char name[NAME_MAX+1]; /* name plus '\0' terminator */
} Dirent;

typedef struct { /* minimal DIR: no buffering, etc. */
	int fd; /* file descriptor for the directory */
	Dirent d; /* the directory entry */
} DIR;

DIR *opendir(char *dirname);
Dirent *readdir(DIR *dfd);
void closedir(DIR *dfd);

The stat system call takes a file name and returns all of the information in the inode for that file, or -1 if there is an error.
It is used as follows:

char *name;
struct stat stbuf;
int stat(char *, struct stat *);

stat(name, &stbuf);

This fills the structure stbuf with the inode information for the file name. The structure describing the value returned by stat is declared in <sys/stat.h>.
It typically looks like this:

struct stat /* inode information returned by stat */
{
	dev_t st_dev; /* device of inode */
	ino_t st_ino; /* inode number */
	short st_mode; /* mode bits */
	short st_nlink; /* number of links to file */
	short st_uid; /* owner's user id */
	short st_gid; /* owner's group id */
	dev_t st_rdev; /* for special files */
	off_t st_size; /* file size in characters */
	time_t st_atime; /* time last accessed */
	time_t st_mtime; /* time last modified */
	time_t st_ctime; /* time inode last changed */
};

The st_mode entry contains a set of flags describing the file, which are defined in <sys/stat.h>. Only the part that deals with the file type is needed here:

#define S_IFMT 0160000 /* type of file */
#define S_IFDIR 0040000 /* directory */
#define S_IFCHR 0020000 /* character special */
#define S_IFBLK 0060000 /* block special */
#define S_IFREG 0100000 /* regular */
/* ... */

The directory entry format comes from the header file <sys/dir.h>, which looks like this:

#ifndef DIRSIZ
#define DIRSIZ 14
#endif
struct direct /* directory entry */
{
	ino_t d_ino; /* inode number */
	char d_name[DIRSIZ]; /* a long name does not end with '\0' */
};

Example -- a storage allocator

malloc calls upon the operating system to obtain more memory as necessary.

Rather than allocating from a compiled-in fixed-size array, malloc requests space from the operating system as needed. Since other parts of the program may obtain space without calling malloc, the space that malloc manages may not be contiguous.
The free storage is therefore kept as a list of free blocks. Each block contains a size, a pointer to the next block, and the space itself.
The blocks are kept in order of increasing storage address, and the last block (highest address) points to the first.

The pointer returned by malloc points at the free space within the block, not at the block header.
The user can do anything with the space obtained, but writing outside the allocated space is likely to corrupt the list.

Figure: a block returned by malloc.
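
The block layout described above can be written down as a header type followed by the usable space. The sketch below follows the design in the book's storage allocator section; the choice of long as the alignment type is an assumption that depends on the machine, and units_needed and user_pointer are hypothetical helper names.

```c
#include <stddef.h>

typedef long Align; /* assumed to be the most restrictive alignment type */

union header { /* block header */
	struct {
		union header *ptr; /* next block on the free list */
		unsigned size;     /* size of this block, in header-sized units */
	} s;
	Align x; /* never used; forces alignment of blocks */
};
typedef union header Header;

/* units_needed: header-sized units required for a request of nbytes,
   rounded up, plus one unit for the header itself */
size_t units_needed(size_t nbytes)
{
	return (nbytes + sizeof(Header) - 1) / sizeof(Header) + 1;
}

/* user_pointer: the address handed to the caller, just past the header */
void *user_pointer(Header *block)
{
	return (void *)(block + 1);
}
```

Because the caller receives the address just past the header, writing before that address overwrites the size and next-block fields, which is how the free list gets corrupted.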

Reference:

The C Programming Language, 2nd Edition (Kernighan and Ritchie)



Posted on Mon, 01 Nov 2021 08:05:21 -0400 by rostros