Linux Programming Notes
To Read
- http://stackoverflow.com/questions/31755790/sockets-unix-domain-udp-c-recvfrom-fail-to-populate-the-source-address?noredirect=1#comment51445587_31755790
- http://www.toptip.ca/2013/01/unix-domain-socket-with-abstract-socket.html?m=1
- http://man7.org/linux/man-pages/man7/unix.7.html
- http://stackoverflow.com/questions/14643571/localsocket-communication-with-unix-domain-in-android-ndk
- http://www.informit.com/articles/article.aspx?p=366888&seqNum=8
- http://www.thegeekstuff.com/2013/07/linux-process-life-cycle/
- Todo: async-signal safe
Processes, Process Groups & Sessions
References
- https://www.win.tue.nl/~aeb/linux/lk/lk-10.html
Processes
A process has its own independent address space, isolating it from all other processes
in the system, i.e., a process cannot access the memory of another process directly.
Each process in the system is assigned a unique integer to identify it, called the Process IDentifier,
or PID. The first process in a Linux system is the init process, with a PID of 1.
Processes are created in Linux by fork()ing an existing process. Historically,
a fork() would copy the process in its entirety: the parent process' memory would be cloned
for the new child process and the page tables for the child would be created to
"point" correctly to the new memory. That's expensive as the
system has to copy a potentially large amount of memory. For example, if a huge process
using, say, 1.5GB of RAM just wanted to exec a really small utility, the 1.5GB
of memory is copied only to be immediately replaced by a program requiring minimal memory, say
5MB! What a waste of time!
That is why Linux uses copy-on-write pages. This way a page of the
parent process's memory is only copied if the parent or the child tries to write to it. Therefore,
in the above example, the parent and child share the same memory until the child
exec()s another program, and the potentially huge memory copy is avoided.
Should the child modify the shared memory, a copy of the affected memory page(s) is
created for the child, but only the modified pages need be copied, so it is again as
efficient as possible.
... Under Linux, fork(2) is implemented using copy-on-write pages, so the only penalty incurred by fork(2) is the time and memory required to duplicate the parent's page tables, and to create a unique task structure for the child. However, in the bad old days a fork(2) would require making a complete copy of the caller's data space, often needlessly...
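Copy-on-write itself cannot be observed from user space, but the isolation it preserves can. The following is a minimal sketch of that: the function name `child_write_stays_private` and the variable `sharedLooking` are made up for this illustration. The child writes to a variable after fork() and the parent checks that its own copy is untouched.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical variable for the demo. After fork() the page holding it is
// shared by parent and child until one of them writes to it.
static int sharedLooking = 1;

// Returns true when the child's write is invisible to the parent.
bool child_write_stays_private() {
    pid_t pid = fork();
    if (pid == -1) {
        return false;                      // fork failed
    }
    if (pid == 0) {
        sharedLooking = 2;                 // write lands in the child's private copy
        _exit(sharedLooking == 2 ? 0 : 1); // _exit: do not flush the parent's stdio buffers
    }
    int status = 0;
    if (waitpid(pid, &status, 0) != pid) {
        return false;
    }
    bool childSawItsWrite = WIFEXITED(status) && WEXITSTATUS(status) == 0;
    return childSawItsWrite && sharedLooking == 1; // parent still sees the old value
}
```

Calling `child_write_stays_private()` from main() should return true on any system with a working fork(); the same result would be seen with an eager copy, so this shows the semantics rather than the optimisation.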
Create Daemons
References
- http://www.tldp.org/LDP/intro-linux/html/sect_04_02.html
- https://www.freedesktop.org/software/systemd/man/daemon.html
What Is A Daemon?
A daemon is a Linux process that runs "in the background". This means that it is
not visible to the user: for example, it does not output anything to the screen via a
terminal. It is also a direct child of init, so it does not depend (at least directly) on
any other process staying alive.
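One common way to get such a process is the classic "double fork". What follows is only a sketch under assumptions: the function name `spawn_detached` and the pipe used for reporting are invented for this illustration, and the caller stays alive so it can report back (in a real daemon the original process would normally exit as well). The orphaned grandchild is re-parented to init, or to a sub-reaper process if one is configured.

```cpp
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Returns the session ID of the detached grandchild, or -1 on failure.
pid_t spawn_detached() {
    int report[2];
    if (pipe(report) == -1) return -1;

    pid_t child = fork();
    if (child == -1) { close(report[0]); close(report[1]); return -1; }

    if (child == 0) {
        close(report[0]);
        if (setsid() == -1) _exit(1);        // new session, no controlling terminal
        pid_t grandchild = fork();
        if (grandchild != 0) {
            // The intermediate process exits, orphaning the grandchild.
            _exit(grandchild == -1 ? 1 : 0);
        }
        // Grandchild: the daemon-to-be.
        umask(0);
        if (chdir("/") == -1) _exit(1);      // do not pin any mounted directory
        int devNull = open("/dev/null", O_RDWR);
        if (devNull != -1) {                 // detach stdio from the terminal
            dup2(devNull, STDIN_FILENO);
            dup2(devNull, STDOUT_FILENO);
            dup2(devNull, STDERR_FILENO);
            if (devNull > STDERR_FILENO) close(devNull);
        }
        pid_t sid = getsid(0);
        ssize_t written = write(report[1], &sid, sizeof(sid));
        (void)written;
        _exit(0);                            // a real daemon would start its main loop here
    }

    close(report[1]);
    waitpid(child, nullptr, 0);              // reap the intermediate process
    pid_t sid = -1;
    if (read(report[0], &sid, sizeof(sid)) != static_cast<ssize_t>(sizeof(sid))) sid = -1;
    close(report[0]);
    return sid;
}
```

The second fork() means the daemon is not a session leader, so it cannot accidentally reacquire a controlling terminal. The returned session ID differs from the caller's `getsid(0)`, which is a simple way to check that the process really was detached.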
File System Notifications
INotify
You can use the inotify APIs to "listen" for events relating
to individual files or even directories.
You create an inotify handle to which you can add watches. This handle can then be used to receive events on all of the files/directories that you are watching.
So let's, for example, watch a directory. You can get the example code here. I won't just splurge it all out here, we'll just look at the important bits.
To start receiving events relating to files/directories you need to create an inotify file descriptor:
int inotifyFd = inotify_init();
To tie a directory/file to this file descriptor use the following:
watchDescriptor = inotify_add_watch(inotifyFd, argv[1], IN_ALL);
In the example code I do no command-line checking, so the first argument to the program is taken
to be the file or directory being watched. The macro IN_ALL is my own macro that is just a combination
of all the types of events that can be received (<sys/inotify.h> also provides a ready-made
IN_ALL_EVENTS mask).
To receive events you must read() from the inotify file descriptor:
bytesRead = read(inotifyFd, buffer, sizeof(buffer));
The main point to note here is that the size of the buffer is
much larger than sizeof(struct inotify_event)! The reason for this is
that the inotify_event structure contains as its last member an unsized
array. This is a C99 feature (a flexible array member); it is not part of standard C++,
but GCC and Clang accept it as an extension, so it compiles fine.
The last member, name, is an unsized array, which means that the actual size
of an event is sizeof(struct inotify_event) + inotify_event.len, where
the len field gives the byte-length of name (including all null bytes after the string).
This is why I read data into buffer. To read at least one event, buffer
needs to be at least sizeof(struct inotify_event) + NAME_MAX + 1 bytes in size. Note
that read() only returns whole events: it never splits an event structure across
two reads, so you can be certain to always read a whole number
of events.
So the buffer has one or possibly more events in it. Hence, once
read() fills the buffer, we must traverse all the
inotify_event structures contained within:
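const char *const bufferEnd = buffer + bytesRead;
char *cursor = buffer;  // walk with a separate pointer so buffer itself can be reused and freed
while (cursor < bufferEnd) {
    // Each record is a fixed-size header followed by len bytes of name and padding
    struct inotify_event *iNotifyEvent = reinterpret_cast<struct inotify_event *>(cursor);
    dump_inotify_event(iNotifyEvent);
    cursor += sizeof(struct inotify_event) + iNotifyEvent->len;
}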
This code does a reinterpret_cast, which means the buffer must
be correctly aligned for struct inotify_event: if you statically allocate a buffer you must
make sure it is correctly aligned. To work around this I've dynamically allocated the buffer, which
guarantees correct alignment (malloc() and operator new return memory suitably aligned for any
fundamental type).
The address of buffer[0] is the start of the first event struct.
To get to the start of the next structure we advance the pointer by
sizeof(struct inotify_event) + iNotifyEvent->len bytes. This is the
size of the structure plus the size of the file name string and all of the
null bytes after it: the name has its terminating null byte but also as many extra
null bytes as are required to pad the subsequent structure to the correct alignment.
Thus we can increment the pointer in this way without
worrying about alignment within the buffer. Happy days!
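To make the layout concrete without involving the kernel, here is a sketch that fills a statically allocated, explicitly aligned buffer with two synthetic events and walks it with the same pointer arithmetic. The names `eventBuffer`, `put_event`, `names_in` and `demo_walk` are all made up for this illustration, and the padded length of 16 bytes per name is just an assumed value that keeps the next header aligned.

```cpp
#include <sys/inotify.h>
#include <climits>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// alignas makes this static buffer safe to view as struct inotify_event.
alignas(struct inotify_event) static char eventBuffer[4 * (sizeof(struct inotify_event) + NAME_MAX + 1)];

// Hypothetical helper: lay out one synthetic event at dst, with the name
// padded by null bytes to paddedLen bytes. Returns where the next event starts.
static char *put_event(char *dst, uint32_t mask, const char *name, uint32_t paddedLen) {
    struct inotify_event *ev = reinterpret_cast<struct inotify_event *>(dst);
    ev->wd = 1;
    ev->mask = mask;
    ev->cookie = 0;
    ev->len = paddedLen;
    std::memset(ev->name, 0, paddedLen);
    std::strncpy(ev->name, name, paddedLen - 1);   // always leaves a terminating null
    return dst + sizeof(struct inotify_event) + paddedLen;
}

// Walk every event between begin and end and collect the names.
std::vector<std::string> names_in(const char *begin, const char *end) {
    std::vector<std::string> names;
    const char *cursor = begin;
    while (cursor < end) {
        const struct inotify_event *ev = reinterpret_cast<const struct inotify_event *>(cursor);
        if (ev->len > 0) {
            names.emplace_back(ev->name);          // stops at the first null byte
        } else {
            names.emplace_back();                  // event without a name
        }
        cursor += sizeof(struct inotify_event) + ev->len;
    }
    return names;
}

std::vector<std::string> demo_walk() {
    char *end = put_event(eventBuffer, IN_CREATE, "a.txt", 16);
    end = put_event(end, IN_DELETE, "b", 16);
    return names_in(eventBuffer, end);
}
```

`demo_walk()` returns the two names in order, "a.txt" and "b", even though each name occupies 16 bytes in the buffer: the extra bytes are the null padding described above.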
To clean up we must remove the watch on the directory/file and then close the inotify descriptor:
inotify_rm_watch(inotifyFd, watchDescriptor);
close(inotifyFd);
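Putting the steps together, the following is a minimal end-to-end sketch, not the original example code. The function name `first_created_name` is hypothetical, it watches only for IN_CREATE rather than every event type, and it creates a file itself so that there is always an event to read.

```cpp
#include <sys/inotify.h>
#include <fcntl.h>
#include <unistd.h>
#include <climits>
#include <cstdlib>
#include <string>

// Watches dirPath, creates fileName inside it to provoke an event, and returns
// the name carried by the first IN_CREATE event (empty string on failure).
std::string first_created_name(const std::string &dirPath, const std::string &fileName) {
    std::string result;
    int inotifyFd = inotify_init();
    if (inotifyFd == -1) return result;
    int watchDescriptor = inotify_add_watch(inotifyFd, dirPath.c_str(), IN_CREATE);
    if (watchDescriptor == -1) { close(inotifyFd); return result; }

    // O_EXCL guarantees the file is really created, so an event will be queued.
    const std::string path = dirPath + "/" + fileName;
    int fd = open(path.c_str(), O_CREAT | O_EXCL | O_WRONLY, 0600);
    if (fd != -1) close(fd);

    // Room for a few maximum-size events; malloc gives suitable alignment.
    const size_t bufferSize = 4 * (sizeof(struct inotify_event) + NAME_MAX + 1);
    char *buffer = static_cast<char *>(std::malloc(bufferSize));
    if (buffer != nullptr && fd != -1) {
        ssize_t bytesRead = read(inotifyFd, buffer, bufferSize);
        char *cursor = buffer;
        while (bytesRead > 0 && cursor < buffer + bytesRead) {
            struct inotify_event *ev = reinterpret_cast<struct inotify_event *>(cursor);
            if ((ev->mask & IN_CREATE) && ev->len > 0) {
                result = ev->name;
                break;
            }
            cursor += sizeof(struct inotify_event) + ev->len;
        }
    }

    // Clean up in the reverse order of set-up.
    std::free(buffer);
    if (fd != -1) unlink(path.c_str());
    inotify_rm_watch(inotifyFd, watchDescriptor);
    close(inotifyFd);
    return result;
}
```

For example, calling `first_created_name(dir, "hello.txt")` on an empty, writable directory should return "hello.txt". The read() only happens when the file was actually created, otherwise it could block forever waiting for an event that never comes.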
Select, Poll, EPoll
Select
The Linux man page says that:
select() ... allow[s] a program to monitor multiple file descriptors, waiting until one or more of the file descriptors become "ready" for some class of I/O operation ... (e.g. without blocking or a sufficiently small write) ...
Select will wait on a set of file descriptors with a timeout. You specify three sets of file descriptors to watch for three different events:
- File descriptors becoming ready for reading,
- File descriptors becoming ready for writing,
- File descriptors suffering exceptional conditions.
The sets of file descriptors being watched for events are described by fd_sets.
A set is created and manipulated as follows:
fd_set readSet;                          // named readSet so it does not hide the fd_set type
FD_ZERO(&readSet);                       // CLEAR the set
FD_SET(file_descriptor, &readSet);       // ADD a file descriptor to the set
FD_CLR(file_descriptor, &readSet);       // REMOVE a file descriptor from the set
if (FD_ISSET(file_descriptor, &readSet)) // TEST if fd is part of the set
    ;                                    // file_descriptor is part of readSet
Note that select() overwrites the fd_set variables you
pass it so, if you use it in a loop, remember to re-initialise the set each time!
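As a sketch of that loop, assuming a single descriptor and a hypothetical function name `drain_with_select`, the code below rebuilds both the set and the timeout on every iteration and reads one byte at a time until nothing arrives within the timeout.

```cpp
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

// Reads fd one byte at a time until select() times out, and returns the
// number of bytes seen. timeoutMs is the per-wait timeout in milliseconds.
int drain_with_select(int fd, int timeoutMs) {
    int bytesSeen = 0;
    for (;;) {
        // Re-initialise on every pass: select() overwrites the set, and on
        // Linux it also updates the timeout with the time left.
        fd_set readSet;
        FD_ZERO(&readSet);
        FD_SET(fd, &readSet);
        struct timeval timeout;
        timeout.tv_sec = timeoutMs / 1000;
        timeout.tv_usec = (timeoutMs % 1000) * 1000;

        // The first argument is the highest descriptor in any set, plus one.
        int ready = select(fd + 1, &readSet, nullptr, nullptr, &timeout);
        if (ready <= 0) {
            break;                      // 0 means timeout, -1 means error
        }
        if (FD_ISSET(fd, &readSet)) {
            char byte;
            ssize_t n = read(fd, &byte, 1);
            if (n <= 0) {
                break;                  // end of file or read error
            }
            ++bytesSeen;
        }
    }
    return bytesSeen;
}
```

With a pipe that has had three bytes written to it, `drain_with_select(readEnd, 50)` returns 3: three passes find the descriptor readable and the fourth times out.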
Poll
EPoll
Select Vs Poll
I found Daniel Stenberg's analysis a very good read for this.