Lab02 - Extending and Adding New xv6 Commands
Deliverables due Mon Feb 10 by 11:59pm in your Lab02 GitHub repo
- A copy of xv6 with
- Modified commands: wc, cat
- New commands: sort, genfile, diff
- Your source code should conform to xv6 formatting conventions.
- You can use uncrustify to help automate formatting
- https://github.com/uncrustify/uncrustify
- VS Code Extension for Uncrustify
- Here is an Uncrustify config file for xv6: xv6-format.cfg
- Your solutions should pass the autograder tests
- I will provide a starter repo with some added files and a script for the autograder
- You can use Aider, Roo Code, and other coding assistants, but you are responsible for understanding any code you submit. You are also responsible for understanding the existing code you modify.
Overview
Currently, xv6 provides only a limited number of common Unix commands, and those it does provide are simple implementations. In this lab we are going to extend two existing commands (wc and cat) and add new commands (sort, genfile, diff). To support these commands you will need to understand the existing source code, basic usage of command-line arguments, file descriptors, and file I/O with open(), read(), write(), and close().
wc options (EASY)
In Linux and xv6, by default, the wc command shows the line count, word count, and character count of each file in a list of files:
$ wc README
46 319 2292 README
However, in Linux, you can provide the following command line options:
wc [-l] [-w] [-c]
These options control the output of the wc command:
-l lines
-w words
-c characters
By default, without options, wc prints all three of these values.
$ wc -l README
46 README
For this part of the lab you need to modify the existing implementation of wc in xv6 to support these new options.
Note that wc can also take multiple files as arguments:
$ wc README domains.txt
46 319 2292 README
4 8 106 domains.txt
The options you add should still work with multiple files:
$ wc -l README domains.txt
46 README
4 domains.txt
Finally, the wc command can also take input from stdin. Your modified wc should work with stdin as well:
$ cat README | wc -l
46
Note that the options can be provided in any order.
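One possible way to handle the options is sketched below. It is only an illustration: the function name parse_wc_flags() and the SHOW_* constants are made up for this example, and it assumes that all options appear before the first filename. It uses the standard C header so it can be compiled on a host; in xv6 you would include kernel/types.h and user/user.h instead.

```c
#include <string.h>  /* on xv6, strcmp() is declared in user/user.h instead */

/* Hypothetical flag bits for this sketch; they are not part of xv6. */
#define SHOW_LINES 1
#define SHOW_WORDS 2
#define SHOW_CHARS 4

/*
 * Scan the leading option arguments (-l, -w, -c) in any order.
 * Stores the index of the first non-option argument in *first_file
 * and returns the selected flag bits, or -1 for an unknown option.
 */
int
parse_wc_flags(int argc, char *argv[], int *first_file)
{
  int flags = 0;
  int i;

  for(i = 1; i < argc && argv[i][0] == '-'; i++){
    if(strcmp(argv[i], "-l") == 0)
      flags |= SHOW_LINES;
    else if(strcmp(argv[i], "-w") == 0)
      flags |= SHOW_WORDS;
    else if(strcmp(argv[i], "-c") == 0)
      flags |= SHOW_CHARS;
    else
      return -1;
  }
  /* No options at all means "print all three counts". */
  if(flags == 0)
    flags = SHOW_LINES | SHOW_WORDS | SHOW_CHARS;
  *first_file = i;
  return flags;
}
```

If *first_file ends up equal to argc, no filenames were given, which is the case where wc would read from stdin (file descriptor 0).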
Line numbers for cat (MODERATE)
The cat command, by default, simply outputs the contents of the files provided as arguments:
$ cat domains.txt
[google.com, 142.251.46.174]
[usfca.edu, 23.185.0.2]
[mit.edu, 104.90.21.210]
[openai.com, 13.107.238.57]
However, cat on Linux supports the -n option, which prepends a line number to each line in the output:
$ cat -n domains.txt
     1  [google.com, 142.251.46.174]
     2  [usfca.edu, 23.185.0.2]
     3  [mit.edu, 104.90.21.210]
     4  [openai.com, 13.107.238.57]
Note: The line number column is 8 characters wide: 6 for the line number and 2 spaces after the line number. Also, domains.txt is included in the starter repo.
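The xv6 printf() is minimal and does not appear to support field widths such as %6d, so the column may need to be built by hand. The sketch below shows one way to do that. The helper name format_line_number() is made up for this example, and it assumes the number is right-aligned within its 6 characters, as on Linux. Numbers with more than 6 digits are not handled.

```c
/*
 * Write num right-aligned in a 6-character field, followed by 2 spaces,
 * into out. The buffer must hold at least 9 bytes (8 characters plus
 * the null terminator). Returns the width of the column (8).
 */
int
format_line_number(int num, char *out)
{
  int i;

  /* Start with a field of 6 spaces. */
  for(i = 0; i < 6; i++)
    out[i] = ' ';

  /* Fill in the digits from the right edge of the field. */
  i = 5;
  do {
    out[i--] = '0' + (num % 10);
    num /= 10;
  } while(num > 0 && i >= 0);

  /* Two spaces separate the number from the line text. */
  out[6] = ' ';
  out[7] = ' ';
  out[8] = '\0';
  return 8;
}
```

The resulting 8 characters could then be written to stdout with write(1, out, 8) before the text of each line.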
For this part of the lab you need to modify the existing xv6 implementation of cat to support the -n option.
Just like wc, cat can accept multiple filenames as arguments, and it can also accept input from stdin. Your support for -n should work with both multiple files and stdin. Also like wc, you can assume that -n is the first argument to cat if it is present. For example:
$ cat domains.txt | cat -n
     1  [google.com, 142.251.46.174]
     2  [usfca.edu, 23.185.0.2]
     3  [mit.edu, 104.90.21.210]
     4  [openai.com, 13.107.238.57]
Hint: the xv6 implementation of cat simply reads chunks of bytes from the input and writes them to stdout (file descriptor 1). However, we need to read a line at a time. To help you out, here is an implementation of readline(), which uses the read() system call to read an entire line from a file or stdin:
int
readline(int fd, char *buf, int maxlen)
{
  int n;
  char c;
  int i = 0;

  /* Read one character at a time from fd */
  while((n = read(fd, &c, 1)) > 0){
    buf[i] = c;
    /* Look for the newline character */
    if(c == '\n'){
      /* We are at the end of the line, so stop reading */
      break;
    }
    i += 1;
    /* We don't want to read more characters than we have room for */
    if(i >= (maxlen - 1)){
      /* We can't recover, so just print a message and exit */
      fprintf(2, "readline() - line too long\n");
      exit(-1);
    }
  }
  /* This is a little tricky. If read() returns 0 AND we didn't
     read previous characters for this line, then we want to return 0.
     Also, if read() returns a value less than 0, we want to return this
     error condition. */
  if(((n == 0) && (i == 0)) || (n < 0))
    return n;
  /* If we stopped on a newline (n > 0), count it. At end of file
     without a trailing newline, i already equals the number of
     characters stored, so it must not be incremented again. */
  if(n > 0)
    i += 1;
  /* Add the null terminator to the end of the string buffer */
  buf[i] = '\0';
  return i;
}
Adding the sort command to xv6 (HARD)
The sort command in Linux allows you to sort the lines of a text file:
$ cat states.txt
Nebraska
Alabama
California
Oregon
Georgia
$ sort states.txt
Alabama
California
Georgia
Nebraska
Oregon
The sort command can also accept input from stdin:
$ cat states.txt | sort
Alabama
California
Georgia
Nebraska
Oregon
By default, sort will sort the lines in lexicographic order. Your sort should also accept the -r option, which sorts in reverse order.
For the implementation you can assume some maximum values and use a fixed-size array of strings for sorting:
#define MAX_LINES 1000
#define MAX_LINE_LEN 100
char lines[MAX_LINES][MAX_LINE_LEN];
You can use any sorting algorithm you want, but insertion sort is pretty simple:
InsertionSort(array A)
  for i from 1 to length(A) - 1 do
    key ← A[i]
    j ← i - 1
    // Move elements of A[0..i-1] that are greater than key
    // one position ahead of their current position
    while j ≥ 0 and A[j] > key do
      A[j + 1] ← A[j]
      j ← j - 1
    A[j + 1] ← key
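The pseudocode above could be translated to C roughly as follows. This is a sketch, not the required solution: the function name sort_lines() and its reverse parameter are made up for this example, and it assumes that lexicographic order means the byte order given by strcmp(). It uses the standard C header so it can be compiled on a host; in xv6, strcmp() and strcpy() are declared in user/user.h.

```c
#include <string.h>  /* on xv6, use kernel/types.h and user/user.h instead */

#define MAX_LINES 1000
#define MAX_LINE_LEN 100

/*
 * Sort the first count rows of lines in place using insertion sort.
 * If reverse is nonzero, the result is in descending order (as for -r).
 */
void
sort_lines(char lines[][MAX_LINE_LEN], int count, int reverse)
{
  char key[MAX_LINE_LEN];
  int i, j, cmp;

  for(i = 1; i < count; i++){
    /* Strings are arrays, so "key = A[i]" becomes a copy. */
    strcpy(key, lines[i]);
    j = i - 1;
    while(j >= 0){
      cmp = strcmp(lines[j], key);
      /* Stop once lines[j] is already on the correct side of key. */
      if(reverse ? (cmp >= 0) : (cmp <= 0))
        break;
      strcpy(lines[j + 1], lines[j]);
      j--;
    }
    strcpy(lines[j + 1], key);
  }
}
```

Copying whole lines with strcpy() keeps the sketch close to the pseudocode. Sorting an array of indices or pointers instead would avoid the copies, at the cost of a little more bookkeeping.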
Adding genfile to xv6
To make it easier to test commands such as grep and sort, you are going to write a simple program that generates files from a repeated pattern. This will also give you experience with creating files in addition to reading them.
The genfile command takes the following arguments:
genfile <filename> <count> <string1> [<string2> ...]
<filename> is the file to be created, and the strings provided are written to it, one per line, <count> times:
$ genfile foo.txt 3 foo bar baz
$ cat foo.txt
foo
bar
baz
foo
bar
baz
foo
bar
baz
Note that the genfile command should overwrite <filename> if it exists. To do this you need to use the O_TRUNC flag when opening the file, like this:
fd = open(argv[1], O_CREATE | O_WRONLY | O_TRUNC);
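Once the file is open, the writing step might look like the sketch below. The helper name write_pattern() is made up for this example, and it takes an already-open file descriptor so that the same code compiles on a host with the standard headers. In xv6 you would include kernel/types.h and user/user.h instead.

```c
#include <string.h>  /* strlen(); declared in user/user.h on xv6 */
#include <unistd.h>  /* write(); declared in user/user.h on xv6 */

/*
 * Write each of the n strings on its own line, repeating the whole
 * group count times. Returns 0 on success or -1 if a write fails.
 */
int
write_pattern(int fd, int count, int n, char *strings[])
{
  int rep, i, len;

  for(rep = 0; rep < count; rep++){
    for(i = 0; i < n; i++){
      len = strlen(strings[i]);
      if(write(fd, strings[i], len) != len)
        return -1;
      /* Each string goes on its own line. */
      if(write(fd, "\n", 1) != 1)
        return -1;
    }
  }
  return 0;
}
```

In genfile's main(), after checking that open() succeeded, a call such as write_pattern(fd, atoi(argv[2]), argc - 3, &argv[3]) followed by close(fd) would produce the file shown in the example.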
Adding diff to xv6
The diff command compares two files line by line and prints the lines that are different:
$ cat foo.txt
foo
baz
bar
$ cat bar.txt
foo
goo
bar
$ diff foo.txt bar.txt
line 2
< baz
---
> goo
$ cat > zoo.txt
foo
baz
zoo
$ diff foo.txt zoo.txt
line 3
< bar
---
> zoo
The first example shows that line 2 in foo.txt is different from line 2 in bar.txt. The line from each file is displayed.
You can use readline() for diff.
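The per-line comparison and output format could be sketched as follows. The helper name report_if_different() is made up for this example, and it assumes each line still ends with the '\n' that readline() stores in the buffer. What diff should do when one file has more lines than the other is not specified here, so the sketch does not address it. On xv6, printf() and strcmp() are declared in user/user.h rather than the standard headers.

```c
#include <stdio.h>   /* printf(); declared in user/user.h on xv6 */
#include <string.h>  /* strcmp(); declared in user/user.h on xv6 */

/*
 * Compare one line from each file. If they differ, print the line
 * number and both lines in the format shown in the examples.
 * Returns 1 if the lines differ and 0 if they are identical.
 * Both a and b are assumed to end with a newline character.
 */
int
report_if_different(int lineno, const char *a, const char *b)
{
  if(strcmp(a, b) == 0)
    return 0;
  printf("line %d\n", lineno);
  printf("< %s", a);
  printf("---\n");
  printf("> %s", b);
  return 1;
}
```

A diff main() could call readline() on both file descriptors in a loop, keep a running line counter, and pass each pair of buffers to this helper.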