------------------------------------------------------------------------------- CSCE 230 Homework 0 (Bf version) Readme Brian Kell bkell@cse.unl.edu January 26, 2003 ------------------------------------------------------------------------------- With the goal of creating the strangest program submitted for homework 0, I decided to write a version of the program in the venerable language Bf. This enabled me to write a unique algorithm for determining the binary representation of an integer, very probably completely unlike any of the other submissions. To begin a discussion on the operation of this program, a brief discussion of the Bf language is in order. ----- The Bf language --------------------------------------------------------- Bf was conceived by Urban Mueller, who first implemented it on the Amiga computer system. The full name of the language is Brainfuck, which aptly describes the feeling one experiences while trying to write anything useful with it. However, I have decided to follow Frans Faase's convention of referring to the language as Bf (although Faase capitalizes the F, which may admittedly be a more justified practice). The language is based on the concept of the Turing machine, a very simple computing model formulated by Alan Turing, which has been proven to have the capability to compute any function that can be calculated by a classical computer (although this is not true for quantum computers). More information about the Turing machine, along with three proofs that Bf is Turing-complete, can be found at http://home.wxs.nl/~faase009/Ha_bf_Turing.html. Bf views computer memory as an array of bytes, all initially set to zero. It also provides for a single pointer, which initially points to the first memory location. There are eight commands in the language: + Increments the value in the memory location under the pointer. - Decrements the value in the memory location under the pointer > Moves the pointer one memory location to the right. < Moves the pointer one memory location to the left. , Reads the next character from the input and stores its numeric representation in the memory location under the pointer. . Writes the character corresponding to the value in the memory location under the pointer to the output. [ If the value in the memory location under the pointer is zero, moves program execution to the command immediately following the corresponding "]". ] If the value in the memory location under the pointer is not zero, moves program execution to the command immediately following the corresponding "[". Any character other than these eight is considered a comment. These commands can be expressed in C as follows, where p is the pointer: + ++*p; - --*p; > ++p; < --p; , p = getchar(); . putchar(p); [ while (*p) { ] } It should be obvious from this short description that writing Bf programs of any practical value whatsoever is nearly impossible. It is good to keep in mind while trying to write Bf that this phrase "nearly impossible" has been mathematically proven to mean "definitely possible" (although it has not been proven to mean "sane"). ----- Program description ----------------------------------------------------- Since there are only eight commands in the language, and since they represent such elementary operations, the interpreter was easy to write. With the exception of one minor bug in my bracket-handling code, the interpreter was finished in less than half an hour. Obviously, the Bf code presented much more of a challenge than the interpreter. After I realized that I would need to create an equivalent to C's atoi function (since Bf provides only character-based input), and that Bf did not have any facilities to access individual bits of memory, I decided that my conversion routine would convert directly from characters to binary. The integer entered by the user is stored in 32 consecutive memory locations, each containing either a 0 or a 1. The contents of these memory locations are then written to output, producing the binary string. Therefore the majority of the work done by the program is performed in the section of code that converts the input received from the user into this 32-byte binary representation. The process implemented is relatively simple. The 32 bytes are first initialized to zero. (This is unnecessary for the first integer, since Bf automatically initializes all memory locations to zero, but is needed for successive integers.) Program execution then enters a loop, which consists of multiplying the binary value stored in the 32 bytes by 10 and adding the digit represented by the character read from input. This continues until there are no more characters to read, at which point the 32 bytes should contain the binary representation of the integer entered by the user. I quickly discovered that even the simple step of multiplying by 10 in Bf is not nearly as simple as it first sounds, especially since I was simulating a 32-bit integer with 32 consecutive bytes. The solution I found was as follows: The program first creates a copy of the current 32-byte value in another 32-byte range. One copy is then multiplied by four, by means of shifting each byte two places to the left. The two copies are added together, and this sum is then doubled by shifting the bytes one place to the left. This value is now ten times the original value. Of course, with every addition comes the problem of carrying. I needed to devise a carry routine to make sure that my bytes remained zeros and ones instead of becoming twos and threes. The basic form of my carry algorithm is [(>)+(<)-](>)[-[-[-(<)+(>)](<)-<+>(>)](<)+(>)](<) (In my Bf "pseudocode," the parentheses represent repetition of the enclosed angle brackets. These operations must be repeated a number of times sufficient to move the pointer to a memory location that is not currently used by another part of the program, to avoid overwriting memory.) If the current memory location contains a two or a three, this snippet of code will subtract two from this value and increment the next memory location to the left. It will leave a zero or a one as is. Another useful Bf algorithm I devised while working on this program was the NOT gate, which will convert a zero in the current memory location to a one, and vice versa. It looks like this: [(>)+>+<(<)-](>)[>--<-]>+[<(<)+(>)>-]<(<) I ended up not using this gate in my code, but it is nevertheless an interesting piece of Bf. It actually computes the opposite of the value in the current memory location and increments it, so 2 will be converted to -1. (The concept of -1 in Bf is implementation-defined; some interpreters allow negative numbers to be represented, while others will wrap around, causing a -1 to be represented as the largest representable positive number. In any case, "-1" is the value obtained by decrementing a zero value, and incrementing a "-1" will result in zero.) ----- Program execution ------------------------------------------------------- When the Bf version of BitPattern is run, it will produce output identical to that required by the homework description. The two largest deviations from the homework requirements both stem from the fact that Bf has no support for functions or subroutines of any kind. Bf also has no support for naming anything--not even variables. Both of these limitations are severe obstacles to writing a function named "printBits." I therefore decided that my only choice was to write the character-to-binary conversion code directly in the main loop of the program. I also decided to hard-code the three examples (the binary representations of 1, 175, and 143165576) directly into the opening text, rather than duplicating the conversion routine or implementing some overly complicated scheme that would somehow bypass the prompt and use predefined values for the first three iterations of the loop. The prompt asks for a positive integer in the range 1 through 2147483647 in order to fulfull the requirements of the homework. However, since the integer is represented internally by a 32-byte array representing an unsigned 32-bit integer, any integer in the range 1 through 4294967295 is acceptable input and will be handled correctly. The number 0 can also be entered, to end the program. Any other input will result in undefined program behavior, in the sense that I don't want to spend the time figuring out what it will do or implementing any sort of an error-checking system, which I imagine would be hell. I have discovered that the value 4294967296 results in a "pointer past end of memory" error, but I have no idea why and no desire to fix it. Letters or other non-numeric characters will be interpreted as representing the digit (A + 208) % 256, where A is the character's ASCII code. This is a natural consequence of the section of my conversion code that blindly converts the characters 0 through 9 (ASCII values 48 through 57) into their corresponding digits by subtracting 48. The most significant assumption made by this Bf program is that the character representing the end of each integer entered by the user will have the numeric value 10. This is a line feed in ASCII and the '\n' character on both Windows and UNIX systems (at least, when entered from the keyboard), so it seems like a reasonable assumption to make. However, this specifically means that if input is redirected from a file (for some reason) and the last line, which should consist of "0" in order to end the program, is not terminated with a newline, then something not desired will probably happen. Another assumption is that the user will enter only valid integers in the range 0 through 4294967295 inclusive, as mentioned above. This also means that if the user enters some strange input that somehow results in the value 000000000000000000000000 being stored in the 32-byte binary representation, the program will exit, because it will assume that the user entered a zero. Any string consisting entirely of zeros will of course fall prey to this assumption, but there may be some other goofy input that also results in this condition. It is entirely possible that a few bugs remain in this code. I have tested it somewhat thoroughly, and since fixing the bug that sometimes allowed the digits 2 and 3 to creep into the binary representation, all output has exactly matched the output produced by my C version of this program. However, should a bug manifest itself, feel free to let me know. I may or may not fix it. Note that the "undefined behavior" referred to several paragraphs above is not a bug. It is documented and therefore a feature. Needless to say, this program should not be used in mission-critical applications. ----- Justification ----------------------------------------------------------- The choice to use Bf for this program had several motivating reasons. First, the challenge of using Bf to write anything useful at all appealed to me. Since, I reasoned, homework 0 would be the easiest homework all semester, it would be the best chance I would have to write in Bf for a programming assignment. Second, the use of Bf is almost a natural extension to the assignment. I was forced to build my own 32-byte binary representation for integers that simulates the standard 32-bit representation common in today's computers. To achieve this, I needed to consider multiplication as a series of bit shifts and additions, and addition as a sequence of increments and carries. I designed constructs that would duplicate the operation of bitwise operators without having access to individual bits at all, or even the exact integer held in any given memory location. Computer organization is the topic of this course, which implies detailed knowledge of low-level computer operation. While computers are not normally built according to the specifications of a Turing machine, the two models are nevertheless closely intertwined. Writing a Bf program allowed me to explore strategies for achieving common functions, given only the most fundamental operations. Finally, Bf is esoteric enough to almost guarantee the title of "Strangest Program Submitted for Homework 0." If anyone submits anything stranger, I will be very amazed. ----- Sources and further information ----------------------------------------- The Cat's Eye Technologies Bf page, at http://www.catseye.mb.ca/esoteric/bf/, is widely regarded as Bf's home page. It contains a moderate amount of Bf material and is a good starting point for other Bf pages. Frans Faase's Bf page, found at http://home.wxs.nl/~faase009/Ha_BF.html, is one of the most complete Bf pages I could find. Faase has an amazing amount of Bf information, including a good tutorial that teaches some neat Bf tricks and an in-depth study of the generation of big numbers in Bf. Mike Coffey's Bf interpreter, written in JavaScript, is a very impressive piece of work. It is not fast, but it does have a nice interface, and is the foundation for several other interpreters found on the Web. It is an invaluable resource for testing short Bf snippets. I used it, for example, to test my "carry" code, among other things. http://justice.loyola.edu/~mcoffey/pr/5k/ Daniel B. Cristofani has a Bf page with a variety of programs at http://www.hevanet.com/cristofd/brainfuck/. ----- Appendix: CSCE 230 Homework 0 ------------------------------------------- from http://www.cse.unl.edu/~goddard/Courses/CSCE230J/Assignments/HW0.htm CSCE 230 Computer Organization Spring 2003 Steve Goddard Homework 0, January 16 (Total of 50 points) Evaluation of prerequisite knowledge Due: 9:00pm Thursday, January 23 _________ Individual Assignment. This is NOT a Team assignment. 50 points: This is a "simple" C programming problem to help familiarize you with some of the basic C language functions and the binary representation of numbers. The requirements are simple: a. Your C executable program will be named BitPattern. b. No input parameters on the command line will be required. c. The program will print the following to standard output (stdout), adhering to the format as shown. Binary representations for the following positive integers are as follows: 1 = 0000 0000 0000 0000 0000 0000 0000 0001 175 = 0000 0000 0000 0000 0000 0000 1010 1111 143165576 = 0000 1000 1000 1000 1000 1000 1000 1000 d. To accomplish the above, a function printBits(int x) should be designed and coded, which prints the line x = yyyy yyyy yyyy yyyy yyyy yyyy yyyy yyyy , to standard output (stdout) where x is a positive integer and the series of y’s represents the binary bit pattern of x. Display the most significant bit to the left in the series. The format must include the right justification of the integers, 2 spaces between the integer and the equals sign, 2 spaces between the equals sign and the MSB of the bit pattern, and then 2 spaces between each group of 4 bits. e. After the display (as described in item c) has been printed to screen, then the following 3 line prompt should be displayed: Enter a positive integer in the range of 1 to 2147483647 or Enter 0 to quit. _ One empty line should separate the last line of the initial display and the first line of the three line prompt. If 0 is entered, than the program should terminate. If an integer in the range of 1 to 2,4147,483,647 [sic] is entered, than the bit pattern for that integer is printed to stdout after which the above prompt is again displayed with one empty line between the bit pattern and the prompt. The format of the print out should match the first three bit patterns displayed. An example of the complete format is given below. Binary representations for the following positive integers are as follows: 1 = 0000 0000 0000 0000 0000 0000 0000 0001 175 = 0000 0000 0000 0000 0000 0000 1010 1111 143165576 = 0000 1000 1000 1000 1000 1000 1000 1000 Enter a positive integer in the range of 1 to 2147483647 or Enter 0 to quit. 127 127 = 0000 0000 0000 0000 0000 0000 0111 1111 Enter a positive integer in the range of 1 to 2147483647 or Enter 0 to quit. 0 f. The coding standard must be followed. g. Create and turn in a Makefile that will build your executable from your source file(s). The problem will be scored at follows: Make File 5% (2.5 pts) README.TXT file 5% (2.5 pts) Program correctness 50% (25 pts) Quality of design/readability 20% (13 pts) In-line documentation/coding standard 20% (12 pts) No analysis report is required, but a README.TXT file is required to explain program compilation and execution. Program correctness will be determined by the correct functionality of the program as well as adherence to format specifications. -------------------------------------------------------------------------------