Analysing Pipe Operations in C
This article is an exploration of the pipe function in C. The pipe function is
a method of IPC (inter-process communication). Pipes are unidirectional, and a
pipe created this way can only be shared between related processes, typically a
parent and its child*.
*: Technically, you can create a pipe between two programs in bash, but that
pipe isn't controlled by the programs; it's just a way to redirect one
program's stdout into the other program's stdin.
Before we begin, let's first understand the "framework" of the program. I'll
present the main function and the #define macros here and will not repeat this
code again. This is the same framework that will be used for all future
analysis unless explicitly stated otherwise.
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <sys/types.h>
#define BUFFER_SIZE 50
#define READ_END 0
#define WRITE_END 1
/* These functions will be defined later */
void child_operations(int fd[2], pid_t pid, char *read_msg);
void parent_operations(int fd[2], pid_t pid, char *write_msg);
int main(int argc, char *argv[]) {
    char write_msg[BUFFER_SIZE] = "Greeting";
    char read_msg[BUFFER_SIZE];
    int fd[2];
    pid_t pid;

    for (int i=0; i<BUFFER_SIZE; i++) {
        read_msg[i] = '\0';
    }

    // Create the pipe
    if (pipe(fd) == -1) {
        fprintf(stderr, "Pipe failed.\n");
        return 1;
    }

    // Fork a child process
    pid = fork();
    if (pid < 0) {
        fprintf(stderr, "Fork failed.\n");
        return 1;
    }

    if (pid > 0) parent_operations(fd, pid, write_msg);
    else child_operations(fd, pid, read_msg);

    return EXIT_SUCCESS;
}
Simple Case
Here are the first versions of the parent_operations and child_operations
functions:
void child_operations(int fd[2], pid_t pid, char *read_msg) {
    // Close the write end of the pipe
    close(fd[WRITE_END]);
    printf("C: Child process about to read.\n");
    read(fd[READ_END], read_msg, BUFFER_SIZE);
    printf("C: Read '%s'\n", read_msg);
    close(fd[READ_END]);
    printf("C: Child process exiting.\n");
    exit(EXIT_SUCCESS);
}
void parent_operations(int fd[2], pid_t pid, char *write_msg) {
    // Close unused end of the pipe
    close(fd[READ_END]);
    // Sleep for one second so that the child reaches read() first
    printf("P: Parent entering sleep (sls 1).\n");
    sleep(1);
    printf("P: Parent awake (sla 1).\n");
    write(fd[WRITE_END], write_msg, strlen(write_msg) + 1);
    printf("P: Completed write operations to pipe.\n");
    printf("P: Parent sleeping before closing write end (sls 2).\n");
    sleep(1);
    printf("P: Parent awakens from sleep (sla 2).\n");
    close(fd[WRITE_END]);
    printf("P: Parent sleeping before exiting (sls 3).\n");
    sleep(1);
    printf("P: Parent awakens from sleep (sla 3).\n");
    printf("P: Parent process exiting.\n");
    exit(EXIT_SUCCESS);
}
Here is the output:
$ ./main
P: Parent entering sleep (sls 1).
C: Child process about to read.
P: Parent awake (sla 1).
P: Completed write operations to pipe.
P: Parent sleeping before closing write end (sls 2).
C: Read 'Greeting'
C: Child process exiting.
P: Parent awakens from sleep (sla 2).
P: Parent sleeping before exiting (sls 3).
P: Parent awakens from sleep (sla 3).
P: Parent process exiting.
Here we clearly see that the child is "waiting" for the parent to write to the
pipe before it reads it. However, it doesn't wait for the parent to close its
side of the pipe, and it doesn't wait for the parent to exit. It only waits for
the parent to complete its write() call before trying to read from the buffer.
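To put a number on that waiting, here is a minimal sketch that times how long a
read stays blocked. It is not part of the framework above: the helper name
ms_blocked_in_read is hypothetical, the use of clock_gettime and nanosleep is an
assumption made for illustration, and the roles are swapped (the child writes,
the parent reads) so that the measurement can be returned to the caller.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: returns how many milliseconds the parent spent
 * waiting for read() to return while the child delayed its write by
 * delay_ms. Returns -1 if pipe(), fork(), or the read itself fails. */
long ms_blocked_in_read(long delay_ms) {
    int fd[2];
    struct timespec start, end;
    char byte;

    if (pipe(fd) == -1) return -1;

    /* Start the clock before forking so the child's delay is fully covered. */
    clock_gettime(CLOCK_MONOTONIC, &start);

    pid_t pid = fork();
    if (pid < 0) {
        close(fd[0]);
        close(fd[1]);
        return -1;
    }

    if (pid == 0) {
        /* Child: wait, then write a single byte and exit. */
        struct timespec delay = {
            .tv_sec = delay_ms / 1000,
            .tv_nsec = (delay_ms % 1000) * 1000000L
        };
        close(fd[0]);
        nanosleep(&delay, NULL);
        if (write(fd[1], "x", 1) != 1) _exit(1);
        _exit(0);
    }

    /* Parent: this read() blocks until the child's write() has completed. */
    close(fd[1]);
    ssize_t n = read(fd[0], &byte, 1);
    clock_gettime(CLOCK_MONOTONIC, &end);

    close(fd[0]);
    waitpid(pid, NULL, 0);
    if (n != 1) return -1;

    long long ns = (end.tv_sec - start.tv_sec) * 1000000000LL
                 + (end.tv_nsec - start.tv_nsec);
    return (long)(ns / 1000000LL);
}
```

Calling ms_blocked_in_read(200) should report roughly 200 or a little more,
since the read cannot return before the child's write has happened.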
Multiple Writes
Let's see what happens if the parent tries to write to the pipe multiple times.
Differences in this code:
- Parent calls the write function twice with the same string.
- Child now tries to print both parts of the write by shifting the pointer.
Here's the parent process:
void parent_operations(int fd[2], pid_t pid, char *write_msg) {
    // Close unused end of the pipe
    close(fd[READ_END]);
    int sleep_count = 1;

    // Sleep for one second so that the child reaches read() first
    printf("P: Sleep before first write (sls %d).\n", sleep_count);
    sleep(1);
    printf("P: Awake (sla %d).\n", sleep_count);
    sleep_count++;
    write(fd[WRITE_END], write_msg, strlen(write_msg) + 1);
    printf("P: Completed write operations to pipe.\n");

    printf("P: Sleep before seond write (sls %d).\n", sleep_count);
    sleep(1);
    printf("P: Awake (sla %d).\n", sleep_count);
    sleep_count++;
    // Write the same thing again.
    write(fd[WRITE_END], write_msg, strlen(write_msg) + 1);
    printf("P: Completed write operations to pipe.\n");

    printf("P: Sleeping before closing write end (sls %d).\n", sleep_count);
    sleep(1);
    printf("P: Awakens from sleep (sla %d).\n", sleep_count);
    sleep_count++;
    close(fd[WRITE_END]);

    printf("P: Parent sleeping before exiting (sls %d).\n", sleep_count);
    sleep(1);
    printf("P: Parent awakens from sleep (sla %d).\n", sleep_count);
    sleep_count++;
    printf("P: Parent process exiting.\n");
    exit(EXIT_SUCCESS);
}
Here is the child process:
void child_operations(int fd[2], pid_t pid, char *read_msg) {
    // Close the write end of the pipe
    close(fd[WRITE_END]);
    printf("C: Child process about to read.\n");
    read(fd[READ_END], read_msg, BUFFER_SIZE);
    printf("C: Read Print 1 '%s'\n", read_msg);
    printf("C: Read Print 2 '%s'\n", read_msg + strlen(read_msg) + 1);
    close(fd[READ_END]);
    printf("C: Child process exiting.\n");
    exit(EXIT_SUCCESS);
}
After this single read, the read_msg buffer will look like one of the following:
// If both are read:
['G', 'r', 'e', ... , '\0', 'G', 'r', 'e' ... ];
// If it is only read once:
['G', 'r', 'e', ... , '\0', '\0', '\0', '\0' ... ];
Well, let's look at the output:
$ ./main
P: Sleep before first write (sls 1).
C: Child process about to read.
P: Awake (sla 1).
P: Completed write operations to pipe.
P: Sleep before seond write (sls 2).
C: Read Print 1 'Greeting'
C: Read Print 2 ''
C: Child process exiting.
P: Awake (sla 2).
~/code/pipe-ipc main !1 ············· ✘ PIPE
There are some interesting observations that we can make based on this output:
- The read in the child only waits for the first write in the parent program.
This makes sense: read returns as soon as some data is available, and the child
has no way of knowing how many writes the parent intends to make, so it wouldn't
make sense to wait for more than one.
- The parent process exited with an error because the pipe was 'broken' when the
child process exited.
We see here that if the child process exits and the parent then tries to write
to the pipe, the parent is sent the SIGPIPE signal ('broken pipe') and
terminates. This is shown by my terminal in the last line.
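The "PIPE" in that last terminal line is the name of the signal, SIGPIPE, whose
default action is to terminate the process. If the signal is ignored, the write
fails with the error EPIPE instead. Here is a minimal sketch of that; the helper
name errno_after_write_without_reader is hypothetical and not part of the
framework, and closing the read end in the same process stands in for the child
having exited.

```c
#include <errno.h>
#include <signal.h>
#include <unistd.h>

/* Hypothetical helper: writes to a pipe whose read end is already closed,
 * with SIGPIPE ignored, and returns the errno that write() reports
 * (0 if the write unexpectedly succeeds, -1 if pipe() fails). */
int errno_after_write_without_reader(void) {
    int fd[2];
    if (pipe(fd) == -1) return -1;

    /* Ignoring SIGPIPE turns the fatal signal into an ordinary error. */
    void (*old_handler)(int) = signal(SIGPIPE, SIG_IGN);

    close(fd[0]); /* no reader is left, as if the child had exited */

    int result = 0;
    if (write(fd[1], "x", 1) == -1) result = errno;

    close(fd[1]);

    /* Restore the previous signal disposition. */
    if (old_handler != SIG_ERR) signal(SIGPIPE, old_handler);
    return result;
}
```

With this in place the parent could check for EPIPE and print its own message
rather than being killed halfway through its output.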
Multiple Writes And Multiple Reads
Let's see what happens if we try to read multiple times in the child after we
write multiple times. I'm not going to change the parent code; I'm only going to
add a line to read again in the child.
Child Process code:
void child_operations(int fd[2], pid_t pid, char *read_msg) {
    // Close the write end of the pipe
    close(fd[WRITE_END]);
    printf("C: Child process about to read.\n");
    read(fd[READ_END], read_msg, BUFFER_SIZE);
    printf("C: Read Print 1 '%s'\n", read_msg);
    printf("C: Child process about read again.\n");
    // Store the second message right after the first one's '\0', and only
    // ask for as many bytes as still fit in the buffer
    size_t offset = strlen(read_msg) + 1;
    read(fd[READ_END], read_msg + offset, BUFFER_SIZE - offset);
    printf("C: Read Print 2 '%s'\n", read_msg + offset);
    close(fd[READ_END]);
    printf("C: Child process exiting.\n");
    exit(EXIT_SUCCESS);
}
And here is the output:
$ ./main
P: Sleep before first write (sls 1).
C: Child process about to read.
P: Awake (sla 1).
P: Completed write operations to pipe.
P: Sleep before seond write (sls 2).
C: Read Print 1 'Greeting'
C: Child process about read again.
P: Awake (sla 2).
P: Completed write operations to pipe.
P: Sleeping before closing write end (sls 3).
C: Read Print 2 'Greeting'
C: Child process exiting.
P: Awakens from sleep (sla 3).
P: Parent sleeping before exiting (sls 4).
P: Parent awakens from sleep (sla 4).
P: Parent process exiting.
Here we see that everything works fine again. There don't seem to be any
problems, and in this run each read picks up the data from the corresponding
write, helped by the parent sleeping between its two writes. Just to make sure,
I updated the content that is being written the second time to be something
different and here is the output of that:
$ ./main
P: Sleep before first write (sls 1).
C: Child process about to read.
P: Awake (sla 1).
P: Completed write operations to pipe.
P: Sleep before seond write (sls 2).
C: Read Print 1 'Greeting'
C: Child process about read again.
P: Awake (sla 2).
P: Completed write operations to pipe.
P: Sleeping before closing write end (sls 3).
C: Read Print 2 'Some new msg'
C: Child process exiting.
P: Awakens from sleep (sla 3).
P: Parent sleeping before exiting (sls 4).
P: Parent awakens from sleep (sla 4).
P: Parent process exiting
Hypothesis
I suspect the following features of this system:
- The read call waits for something to be written to the pipe before it reads.
- The read call actually reads the number of bytes specified by the last
parameter.
- If the read call doesn't read as much as the write wrote, you could use
multiple reads to read a single write.
- If the read call reads enough data, you could use one read to read multiple
write calls; you'd just need to wait before reading.
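The last point can be checked without forking at all, because a pipe keeps
written bytes in a kernel buffer until someone reads them. The following is a
minimal sketch under that assumption; the helper name one_read_after_two_writes
is hypothetical, and BUFFER_SIZE is redefined here only to keep the example
self-contained.

```c
#include <string.h>
#include <unistd.h>

#define BUFFER_SIZE 50

/* Hypothetical helper: writes msg (including its '\0') into a fresh pipe
 * twice, then issues a single read(). Returns the byte count from that
 * read(), or -1 on any failure. */
ssize_t one_read_after_two_writes(const char *msg, char out[BUFFER_SIZE]) {
    int fd[2];
    size_t len = strlen(msg) + 1;

    /* Both copies must fit in the caller's buffer. */
    if (2 * len > BUFFER_SIZE || pipe(fd) == -1) return -1;

    /* Two small writes fit in the pipe's buffer, so neither blocks. */
    if (write(fd[1], msg, len) != (ssize_t)len ||
        write(fd[1], msg, len) != (ssize_t)len) {
        close(fd[0]);
        close(fd[1]);
        return -1;
    }
    close(fd[1]);

    /* One read() that asks for enough room for both messages. */
    ssize_t n = read(fd[0], out, BUFFER_SIZE);
    close(fd[0]);
    return n;
}
```

On Linux, calling this with "Greeting" should return 18, with the second copy
starting right after the first copy's '\0'. This suggests that a pipe carries a
stream of bytes and does not remember where one write ended and the next began.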
Test 1
This test is going to try to confirm points 2 and 3 from the hypothesis. We can
check whether multiple reads are possible by just reading less than what we
wrote. I'll try to read a single character at a time.
Here is the child process code:
void child_operations(int fd[2], pid_t pid, char *read_msg) {
    // Close the write end of the pipe
    close(fd[WRITE_END]);
    printf("C: Child process about to read.\n");
    // Keep reading as long as a character is available
    while (read(fd[READ_END], read_msg, 1) > 0) {
        printf("C: Read character: '%c'\n", read_msg[0]);
    }
    close(fd[READ_END]);
    printf("C: Child process exiting.\n");
    exit(EXIT_SUCCESS);
}
The parent process is similar to the one in the first test. It just writes
"Greeting" to the pipe and exits.
Here's the output of the code:
$ ./main
P: Sleep before first write (sls 1).
C: Child process about to read.
P: Awake (sla 1).
P: Completed write operations to pipe.
P: Sleeping before closing write end (sls 2).
C: Read character: 'G'
C: Read character: 'r'
C: Read character: 'e'
C: Read character: 'e'
C: Read character: 't'
C: Read character: 'i'
C: Read character: 'n'
C: Read character: 'g'
C: Read character: ''
P: Awakens from sleep (sla 2).
P: Parent sleeping before exiting (sls 3).
C: Child process exiting.
P: Parent awakens from sleep (sla 3).
P: Parent process exiting.
Here we see that there are multiple calls to read. We also see that the loop
stops once the number of characters read drops to 0.
However, shouldn't read wait for input if the number of characters read is 0?
Shouldn't the last call to read just wait for another write from the parent?
What would happen if we added ANOTHER read at the end of the loop?
Let's add a read call right after the loop to see what happens:
read(fd[READ_END], read_msg, 1);
printf("C: Final Read character: '%c'\n", read_msg[0]);
Here's the output:
$ ./main
P: Sleep before first write (sls 1).
C: Child process about to read.
P: Awake (sla 1).
P: Completed write operations to pipe.
P: Sleeping before closing write end (sls 2).
C: Read character: 'G'
C: Read character: 'r'
C: Read character: 'e'
C: Read character: 'e'
C: Read character: 't'
C: Read character: 'i'
C: Read character: 'n'
C: Read character: 'g'
C: Read character: ''
P: Awakens from sleep (sla 2).
P: Parent sleeping before exiting (sls 3).
C: Final Read character: ''
C: Child process exiting.
P: Parent awakens from sleep (sla 3).
P: Parent process exiting.
From this we can conclude that read waits for data to be written if there isn't
data present, as long as some process has the write end open. If no process has
the write end open, then it returns 0 to signal end-of-file (-1 is reserved for
errors).
On success, read returns the number of bytes read, which might be less than the
provided length (in case read stops in the middle or if there isn't enough data
to read).
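These return values fit naturally into a small read loop that treats a positive
count as data, 0 as end-of-file, and -1 as an error. This is only a sketch under
assumptions: the helper name read_until_eof is hypothetical, and retrying on
EINTR is a common convention rather than something the experiments above
exercised.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: keeps calling read() until cap bytes have arrived
 * or the write end is closed. Returns the total number of bytes stored in
 * buf, or -1 on a read error. */
ssize_t read_until_eof(int fd, char *buf, size_t cap) {
    size_t total = 0;

    while (total < cap) {
        ssize_t n = read(fd, buf + total, cap - total);
        if (n > 0) {
            total += (size_t)n;  /* possibly a short read; keep going */
        } else if (n == 0) {
            break;               /* end-of-file: no writer has the pipe open */
        } else if (errno != EINTR) {
            return -1;           /* real error; EINTR just means "retry" */
        }
    }
    return (ssize_t)total;
}
```

In the child, a call such as read_until_eof(fd[READ_END], read_msg, BUFFER_SIZE)
would collect everything the parent wrote before it closed its write end,
however many write calls that took.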
Conclusion
I've added some information here based on the experiments, reading the man pages
for pipe and referring to some online sources.
#include <unistd.h>
int pipe(int fildes[2]);
- The pipe function is declared in <unistd.h>.
- It creates a pipe and stores the file descriptors in the array that was
passed as a parameter.
- The first element is the file descriptor for the read end of the pipe.
- The second element is the file descriptor for the write end of the pipe.
- On success, the function marks for update the last data access, last data
modification, and last file status change timestamps of the pipe. These
timestamps are only bookkeeping; read finds out that a write has completed or
that the write end has been closed because the kernel tracks the pipe's buffer
and its open file descriptors.
- If the pipe is empty and we call the read system call, then read will return
EOF (return value 0) if no process has the write end open.
- If some other process has the pipe open for writing, read will wait in
anticipation of new data.
- You can choose how much data to read from the pipe using the last parameter
of the read function.
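The difference between "returns EOF" and "waits for new data" can be observed
without hanging the program by switching the read end to non-blocking mode, in
which case a read that would have waited fails with EAGAIN instead. A minimal
sketch, assuming Linux/POSIX behaviour; the probe_empty_pipe helper and the
probe_result enum are hypothetical names introduced only for this example.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

enum probe_result { PROBE_EOF, PROBE_WOULD_BLOCK, PROBE_DATA, PROBE_ERROR };

/* Hypothetical helper: creates an empty pipe, optionally closes its write
 * end, and reports what a non-blocking read() on the read end does. */
enum probe_result probe_empty_pipe(int keep_write_end_open) {
    int fd[2];
    char byte;
    enum probe_result result;

    if (pipe(fd) == -1) return PROBE_ERROR;
    if (!keep_write_end_open) close(fd[1]);

    /* O_NONBLOCK makes read() fail with EAGAIN where it would have blocked. */
    int flags = fcntl(fd[0], F_GETFL);
    if (flags == -1 || fcntl(fd[0], F_SETFL, flags | O_NONBLOCK) == -1) {
        result = PROBE_ERROR;
    } else {
        ssize_t n = read(fd[0], &byte, 1);
        if (n == 0) result = PROBE_EOF;
        else if (n > 0) result = PROBE_DATA;
        else if (errno == EAGAIN || errno == EWOULDBLOCK) result = PROBE_WOULD_BLOCK;
        else result = PROBE_ERROR;
    }

    close(fd[0]);
    if (keep_write_end_open) close(fd[1]);
    return result;
}
```

probe_empty_pipe(0) should report PROBE_EOF and probe_empty_pipe(1) should
report PROBE_WOULD_BLOCK, matching the two read cases in the list above.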
Here are the declarations of the read and write system calls:
#include <unistd.h>
ssize_t read(int fd, void *buf, size_t count);
ssize_t write(int fd, const void *buf, size_t count);
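Both calls return the number of bytes actually transferred, which can be smaller
than count, so code that must send a whole message often loops until everything
has been accepted. Here is a minimal sketch of that pattern; write_all is a
hypothetical helper name and is not used anywhere in the experiments above.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: calls write() until all count bytes are written.
 * Returns 0 on success, -1 on error (errno is left as write() set it). */
int write_all(int fd, const void *buf, size_t count) {
    const char *p = buf;

    while (count > 0) {
        ssize_t n = write(fd, p, count);
        if (n == -1) {
            if (errno == EINTR) continue;  /* interrupted before writing; retry */
            return -1;
        }
        p += n;              /* skip the bytes that were accepted */
        count -= (size_t)n;  /* and try again with the remainder */
    }
    return 0;
}
```

In the parent, write_all(fd[WRITE_END], write_msg, strlen(write_msg) + 1) could
replace the bare write call, and its return value would show whether the message
was fully written.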