Lab02 - Extending and Adding New xv6 Commands

Deliverables due Mon Feb 10 by 11:59pm in your Lab02 GitHub repo

A copy of xv6 with
- Modified commands: wc, cat
- New commands: sort, genfile, diff
Your source code should conform to xv6 formatting conventions.
- You can use uncrustify to help automate formatting
- https://github.com/uncrustify/uncrustify
- VS Code Extension for Uncrustify
- Here is an Uncrustify config file for xv6: xv6-format.cfg
Your solutions should pass the autograder tests
I will provide a starter repo with some added files and a script for the autograder
You can use Aider, Roo Code, and other coding assistants, but you are responsible for understanding any code you submit. You are also responsible for understanding the existing code you modify.

Overview

Currently, xv6 provides only a limited number of common Unix commands, and the ones it does provide are simple implementations. In this lab we are going to extend two existing commands (wc and cat) and add three new commands (sort, genfile, diff). To support these commands you will need to understand the existing source code, basic usage of command-line arguments, file descriptors, and file I/O with open(), read(), write(), and close().

wc options (EASY)

In Linux and xv6, by default, the wc command shows the line count, word count, and character count of each file in a list of files:

$ wc README
46 319 2292 README

However, in Linux, you can provide the following command line options:

wc [-l] [-w] [-c]

These options control the output of the wc command:

-l lines
-w words
-c characters

By default without options, wc prints all three of these values.

$ wc -l README
46 README

For this part of the lab you need to modify the existing implementation of wc in xv6 to support these new options.

Note that wc can also take multiple files as arguments:

$ wc README domains.txt
46 319 2292 README
4 8 106 domains.txt

The options you add should still work with multiple files:

$ wc -l README domains.txt
46 README
4 domains.txt

Finally, the wc command can also take input from stdin. Your modified wc should work with stdin as well:

$ cat README | wc -l
46

Note that the options can be provided in any order.
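
One way to handle the options is to scan the leading arguments that start with '-', record which counts were requested, and fall back to all three when none were given. The sketch below is only an illustration under assumptions, not the required design: struct wc_opts and parse_wc_opts() are hypothetical names, and it assumes every option appears before the first filename. It is plain C with no library calls, so it should carry over to an xv6 user program unchanged.

```c
/* Hypothetical helper type: which counts to print. */
struct wc_opts {
  int lines;
  int words;
  int chars;
};

/* Scan leading "-x" arguments and return the index of the first
   non-option argument (argc if there are no filenames).
   Returns -1 if an unknown option letter is found. */
int
parse_wc_opts(int argc, char *argv[], struct wc_opts *o)
{
  int i, j;

  o->lines = o->words = o->chars = 0;
  for(i = 1; i < argc && argv[i][0] == '-' && argv[i][1] != '\0'; i++){
    /* Accept both "-l -w" and the combined form "-lw" */
    for(j = 1; argv[i][j] != '\0'; j++){
      switch(argv[i][j]){
      case 'l': o->lines = 1; break;
      case 'w': o->words = 1; break;
      case 'c': o->chars = 1; break;
      default:  return -1;
      }
    }
  }
  /* No options given: print all three, as wc does by default */
  if(!o->lines && !o->words && !o->chars)
    o->lines = o->words = o->chars = 1;
  return i;
}
```

Because the helper stores flags rather than the order in which they were typed, "wc -c -l" and "wc -l -c" end up with the same settings. If the returned index equals argc, no filenames were given and wc should read from stdin.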

Line numbers for cat (MODERATE)

The cat command, by default, simply outputs the contents of the files provided as arguments:

$ cat domains.txt
[google.com, 142.251.46.174]
[usfca.edu, 23.185.0.2]
[mit.edu, 104.90.21.210]
[openai.com, 13.107.238.57]

However, cat on Linux allows for the -n option, which prepends a line number to each line in the output:

$ cat -n domains.txt
     1  [google.com, 142.251.46.174]
     2  [usfca.edu, 23.185.0.2]
     3  [mit.edu, 104.90.21.210]
     4  [openai.com, 13.107.238.57]

Note: The line number column is 8 characters wide: 6 for the line number, followed by 2 spaces. Also, domains.txt is included in the starter repo.
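
The 8-character column can be built by hand, which is useful if the printf you have available does not support field widths such as %6d (this appears to be the case for xv6's printf, but check the source). The sketch below is one possible approach; format_lineno() is a hypothetical name, and right-aligning the number within its 6 columns is an assumption based on how cat -n behaves on Linux.

```c
/* Hypothetical helper: write the 8-character prefix for line number n
   into out (which must hold at least 9 bytes): the number right-aligned
   in 6 columns, then 2 spaces, then a terminating '\0'.
   Assumes 1 <= n <= 999999. */
void
format_lineno(int n, char out[9])
{
  int i;

  /* Fill digits from the right, padding the left with spaces */
  for(i = 5; i >= 0; i--){
    if(n > 0){
      out[i] = '0' + (n % 10);
      n /= 10;
    } else {
      out[i] = ' ';
    }
  }
  out[6] = ' ';
  out[7] = ' ';
  out[8] = '\0';
}
```

The resulting string can be written to stdout immediately before each line of input.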

For this part of the lab you need to modify the existing xv6 implementation of cat to support the -n option.

Just like wc, cat can accept multiple filenames as arguments, and it can also accept input from stdin. Your support for -n should work with both multiple files and stdin. Also, as with wc, you can assume that -n, if present, is the first argument to cat. For example:

$ cat domains.txt | cat -n
     1  [google.com, 142.251.46.174]
     2  [usfca.edu, 23.185.0.2]
     3  [mit.edu, 104.90.21.210]
     4  [openai.com, 13.107.238.57]

Hint: the xv6 implementation of cat simply reads chunks of bytes from the input and then writes them to stdout (file descriptor 1). However, for -n we need to read one line at a time. To help you out, here is an implementation of readline() which uses the read() system call to read an entire line from a file or stdin:

int
readline(int fd, char *buf, int maxlen)
{
  int n;
  char c;
  int i = 0;

  /* Read one character at a time from fd */
  while((n = read(fd, &c, 1)) > 0){
    buf[i] = c;
    /* Look for the newline character */
    if (c == '\n'){
      /* We are at the end of the line, so stop reading */
      break;
    }
    i += 1;
    /* We don't want to read more characters than we have room */
    if(i >= (maxlen - 1)){
      /* We can't recover, so just print a message and exit */
      fprintf(2, "readline() - line too long\n");
      exit(-1);
    }
  }
  /* This is a little tricky. If read() returns 0 AND we didn't
     read previous characters for this line, then we want to return 0.
     Also, if read returns a value less than 0, we want to return this
     error condition. */
  if(((n == 0) && (i == 0)) || (n < 0))
    return n;

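  /* If we stopped on a newline, count it. At end of input without a
     trailing newline, i already points just past the last character. */
  if(n > 0)
    i += 1;
  /* Add the null terminator to the end of the string buffer */
  buf[i] = '\0';
  return i;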
}
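
A caller typically loops on readline() until it returns 0 (end of input) or a negative value (read error). The sketch below illustrates that pattern on a POSIX host, so it uses the standard headers and fprintf(stderr, ...) in place of the xv6 headers and fprintf(2, ...). count_lines() is a hypothetical example function, and the readline() shown is a condensed variant of the one above that returns the number of bytes stored, newline included.

```c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#define MAX_LINE_LEN 100  /* assumed limit for this sketch */

/* Condensed variant of readline() so the sketch builds on its own. */
int
readline(int fd, char *buf, int maxlen)
{
  int n, i = 0;
  char c;

  while((n = read(fd, &c, 1)) > 0){
    buf[i++] = c;
    if(c == '\n')
      break;
    if(i >= maxlen - 1){
      fprintf(stderr, "readline() - line too long\n");
      exit(1);
    }
  }
  if((n == 0 && i == 0) || n < 0)
    return n;
  buf[i] = '\0';
  return i;
}

/* Hypothetical caller: count the lines available on fd.
   Returns the line count, or -1 if read() failed. */
int
count_lines(int fd)
{
  char buf[MAX_LINE_LEN];
  int n, count = 0;

  /* > 0: a line is in buf; 0: end of input; < 0: read error */
  while((n = readline(fd, buf, MAX_LINE_LEN)) > 0)
    count++;
  return n < 0 ? -1 : count;
}
```

The same loop shape works for cat -n: replace count++ with code that writes the line number prefix followed by buf.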

Adding the sort command to xv6 (HARD)

The sort command in Linux allows you to sort the lines of a text file:

$ cat states.txt
Nebraska
Alabama
California
Oregon
Georgia

$ sort states.txt
Alabama
California
Georgia
Nebraska
Oregon

The sort command can also accept input from stdin:

$ cat states.txt | sort
Alabama
California
Georgia
Nebraska
Oregon

By default sort will sort the lines in lexicographic order. Your sort should also accept the -r option which will sort in reverse order.

For the implementation you can assume some maximum values and use a fixed array of strings for sorting:

#define MAX_LINES 1000
#define MAX_LINE_LEN 100

char lines[MAX_LINES][MAX_LINE_LEN];

You can use any sorting algorithm you want, but insertion sort is pretty simple:

InsertionSort(array A)
    for i from 1 to length(A) - 1 do
        key ← A[i]
        j ← i - 1

        // Move elements of A[0..i-1], that are greater than key,
        // to one position ahead of their current position
        while j ≥ 0 and A[j] > key do
            A[j + 1] ← A[j]
            j ← j - 1

        A[j + 1] ← key
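
As a rough illustration, the pseudocode above might translate to C over the fixed lines array as follows. This is a sketch under assumptions rather than a reference solution: sort_lines() and its reverse parameter are hypothetical names, and it relies on strcmp() and strcpy(), which xv6's user library provides (include the xv6 headers there instead of <string.h>).

```c
#include <string.h>

#define MAX_LINES 1000
#define MAX_LINE_LEN 100

char lines[MAX_LINES][MAX_LINE_LEN];

/* Insertion sort over the first n rows of lines[].
   A nonzero reverse sorts in descending order (as for -r). */
void
sort_lines(int n, int reverse)
{
  char key[MAX_LINE_LEN];
  int i, j, cmp;

  for(i = 1; i < n; i++){
    strcpy(key, lines[i]);
    j = i - 1;
    /* Shift rows that belong after key one slot to the right */
    while(j >= 0){
      /* Swap the comparison operands to reverse the order */
      cmp = reverse ? strcmp(key, lines[j]) : strcmp(lines[j], key);
      if(cmp <= 0)
        break;
      strcpy(lines[j + 1], lines[j]);
      j--;
    }
    strcpy(lines[j + 1], key);
  }
}
```

Copying whole rows with strcpy() is simple but moves up to MAX_LINE_LEN bytes per shift; sorting an array of pointers into lines[] would avoid the copies at the cost of a little more bookkeeping.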

Adding genfile to xv6

To make it easier to test commands such as grep and sort, you are going to write a simple program that generates files. This will also give you experience with creating files in addition to reading them.

The genfile command takes the following arguments:

genfile <filename> <count> <string1> [<string2> ...]

The <filename> is the file to be created. The strings are written one per line, and the whole list is repeated <count> times:

$ genfile foo.txt 3 foo bar baz
$ cat foo.txt
foo
bar
baz
foo
bar
baz
foo
bar
baz

Note that the genfile command should overwrite <filename> if it exists. To do this, use the O_TRUNC flag when opening the file, like this:

fd = open(argv[1], O_CREATE | O_WRONLY | O_TRUNC);
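
The core of genfile can be sketched as a pair of nested loops around write(). The version below is written for a POSIX host so it can be tried outside xv6: it uses O_CREAT and passes a mode argument, whereas on xv6 the flag is spelled O_CREATE and open() takes only two arguments. write_repeated() is a hypothetical helper name, not something the starter repo provides.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: write each of the n strings on its own line,
   repeating the whole list count times.
   Returns 0 on success, -1 on error. */
int
write_repeated(const char *path, int count, char *strs[], int n)
{
  int fd, i, j, len;

  /* O_TRUNC discards any previous contents of the file */
  fd = open(path, O_CREAT | O_WRONLY | O_TRUNC, 0644);
  if(fd < 0)
    return -1;
  for(i = 0; i < count; i++){
    for(j = 0; j < n; j++){
      len = strlen(strs[j]);
      if(write(fd, strs[j], len) != len || write(fd, "\n", 1) != 1){
        close(fd);
        return -1;
      }
    }
  }
  close(fd);
  return 0;
}
```

From main(), a call such as write_repeated(argv[1], atoi(argv[2]), &argv[3], argc - 3) would match the argument layout described above, after checking that argc is at least 4.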

Adding diff to xv6

The diff command compares two files line by line and prints the lines that are different:

$ cat foo.txt
foo
baz
bar
$ cat bar.txt
foo
goo
bar
$ diff foo.txt bar.txt
line 2
< baz
---
> goo
$ cat > zoo.txt
foo
baz
zoo
$ diff foo.txt zoo.txt
line 3
< bar
---
> zoo

This shows that line 2 in foo.txt is different from line 2 in bar.txt, and that line 3 in foo.txt is different from line 3 in zoo.txt. For each difference, the line from each file is displayed.

You can use readline() for diff.
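
The per-line comparison and the report layout shown above can be sketched as below. format_diff() is a hypothetical helper, and the sketch assumes both lines end with '\n', as lines returned by readline() normally do. It uses snprintf() so it can be tried on a regular host; on xv6 you would more likely print the same pieces directly with printf(), since xv6 has no snprintf().

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: if lines a and b differ, format a report in the
   layout shown above into out and return 1; otherwise return 0.
   Assumes both lines end with '\n'. */
int
format_diff(char *out, size_t cap, int lineno, const char *a, const char *b)
{
  if(strcmp(a, b) == 0)
    return 0;
  snprintf(out, cap, "line %d\n< %s---\n> %s", lineno, a, b);
  return 1;
}
```

A diff main() would read one line from each file per iteration, increment a line counter, and call a helper like this for each pair.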