
CSC100 :: Lecture Note :: Week 02
Assignments | Code | Handouts | Resources | Email Thurman {Twitter::@compufoo Facebook::CSzero}

Overview

Assignments: #assessment1 (due 9/20/2017) and [program] #helloworld (due 9/24/2017)

Handout: DevC++ IDE

Code: HelloWorld.cpp | HelloWorld.c | BadHelloWorld.cpp | new C++ program template
PrimitiveTypes.cpp | Constants.cpp | ArithOps.cpp


What is a Computer?

A computer (hardware) is a programmable electronic device that can store, retrieve, and process data. [Via Wikipedia.org: "A computer is a general purpose device that can be programmed to carry out a finite set of arithmetic or logical operations."]

Data is information that has been put into a form (i.e. format) that a computer can use (i.e. understand). ["Big Data" is a popular buzzphrase in 2012. #BigData]

A computer is comprised of six logical components:

Wikipedia.org::Computer

Archive.org::How It Works... The Computer [original edition 1971; revised edition 1979]

Sloan.Stanford.edu::The First Computer Mouse (circa 1964) was invented by Doug Engelbart.

GDT::Computing::Bit::History of Computing Presented Using YouTube Videos [created 22 August 2009]

LongStreet.Typepad.com::The Computer Tree (1945-1960s)
[via Texas Advanced Computing Center (TACC) Facebook posting on 7 August 2012]

{TopOfPage} {Tutorial} {online IDEs: CodingGround | CPP.sh | jdoodle} {C at MIT} {GDT::C/C++ Resource}


What are Bits, Bytes, and Words?

Memory consists of a sequence of binary digits (or bits). A bit is either on or off (true or false, 1 or 0). Bits are usually grouped into bytes. Typically, there are 8 bits to a byte. A computer that has 16 megabytes (MB) of memory has approximately 16 million (16,000,000) bytes. A group of bytes is often referred to as a word.

   1 kilobyte (1K)     1024 bytes (2 to the power of 10)
   1 megabyte (1MB)    1024K or 1,048,576 bytes (2 to the power of 20)
   1 gigabyte (1GB)    1024MB or 1,073,741,824 bytes (2 to the power of 30)
   tera- ... peta- ... exa-

[More...] Wikipedia.org:: Binary prefix



Computer Languages: machine, assembly, high-level, 4GL

ThurmNet Technologies created a family of pretend programming languages to help discuss the various generations of programming languages.

1st-Generation: Machine Language (tnt1G)

The tnt1G machine language has four "commands" and supports the use of the hexadecimal digits 0 through F.

The commands are:

   0000 -- START
   0011 -- PRINT
   0001 -- END
   1111 -- ABORT

The supported hexadecimal digits are:

   0 -- 0000
   1 -- 0001
   2 -- 0010
   3 -- 0011
   4 -- 0100
   5 -- 0101
   6 -- 0110
   7 -- 0111
   8 -- 1000
   9 -- 1001
   A -- 1010
   B -- 1011
   C -- 1100
   D -- 1101
   E -- 1110
   F -- 1111

Each program begins by issuing a START command. The PRINT command takes one operand -- a DIGIT. The END command is used to terminate a program and it takes one operand -- a DIGIT that represents the program's exit status. The ABORT command abnormally terminates a program and it, too, takes an exit status operand.

The following is a program written in tnt1G that prints my age at the end of 1999.

      0000001101000011001000011010

   To aid readability, let us assume it can be written as follows:

      0000
      0011 0100
      0011 0010
      0001 1010

GDT::Images:: Real programmers code in binary.

2nd-Generation: Assembly Language (tntASM)

The following is the same program written in tntASM (i.e. TNT assembly language):

   START 
   PRINT 4
   PRINT 2
   END A

Prior to executing, an assembly program is submitted to an assembler program that converts it into machine language.

3rd-Generation: High-level Language (tntC)

The following is the same program written in tntC (i.e. TNT 3rd-generation language):

   main() {
      printint 42; 
      exit SUCCESS;
   }

Prior to executing, a program written in a high-level language is submitted to a compiler program that translates it into machine language.

Some of the early popular 3rd-generation languages were BASIC, Fortran, and COBOL.

C is a 3rd-generation language created in 1972 and C++ was developed around 1980.

[Tidbit] When I decided to minor in CS in 1977 the first language learned was BASIC, followed by assembly then PL/I. In 1980 graduate school, Pascal was the primary language used.

4th-Generation: 4GL (tnt4GL)

Finally, the same program written in tnt4GL (i.e. TNT 4th-generation language):

   print 42 and return success

[Definition] Portable refers to the ability to move a program from one machine to another without having to modify the code.

Machine and assembly languages are not portable. Programs written in many 3rd-generation languages can be portable, but only if programmers write with portability in mind.

[More...] Wikipedia.org:: Programming Languages

GDT::Computing::Bit:: What is a real programmer?



Introduction to the Software Development Cycle

Question: Why do we write programs?

Answer: We write programs typically in response to a need from a "user."

The user generates a requirements (or specification) document and gives it to the programmer. The programmer reviews the requirements. If they are understood, then work begins on the program(s); otherwise, the user addresses the concerns raised by the programmer and updates the requirements. The reviewing of the requirements can be (and usually is) an iterative process.

Do not work on a program unless you understand what you are being asked to program.

[More...] Wikipedia.org:: Software development cycle

[More...] Images.Google.com:: waterfall software development cycle



The C and C++ Compilation Process

C and C++ are high-level, 3rd-generation languages. These types of languages must be translated into a machine language in order to be executed by a CPU. The process of translating high-level language into machine language is called the compilation process.

The compilation process consists of the following steps.

   edit source code ->  compile   ->   link    ->   execute
      (editor)         (compiler)    (linker)       (loader)

Program source code is entered into a file using a text editor. After the code has been entered, a compiler program is started that translates the source into an object code file. The object code file is linked with other object code files that come with the compiler and an executable file (or program) is created. In order to execute the program, a program called the loader copies the executable file into the memory of the computer and sends an execute command to the CPU.

Errors can occur during each step.


source      +----------+      object     +--------+      executable
file   ---> | compiler | ---> file  ---> | linker | ---> file      ---+
(x.c)       +----------+      (x.o)      +--------+      (x)          |
                                             ^                        |
                                             |                        v
                                             |                    +--------+
                                             |                    | loader |
                                     Standard C Library           +--------+
                                     (stdc.lib                        |
                                          stdio.o                     |
                                          stdlib.o                    v
                                          math.o                     CPU
                                          ...)

A source file ending with ".c" contains C source code; whereas, a file ending with ".cpp" is a C++ file (note: ".C" suffix may also indicate a C++ source file). A file ending with ".h" can be both a C and C++ header file. Sometimes the suffix ".hpp" (or ".H") is used to indicate a C++ only header file.

The compiler is a program that usually consists of many phases. The first phase of compilation is called preprocessing. The preprocessor does many things, but two features that must be learned immediately are file inclusion and macro (manifest constant) definitions. After preprocessing, the compiler executes two primary steps: lexical analysis and parsing. During lexical analysis, the source code is broken up into tokens and the tokens are passed to the parser. The parser does syntax and semantic analysis, which includes the generation of object code (i.e. machine language).

The linker "combines" all object code files into an executable file (by default, named a.out on Unix systems). Typically, the object files created by your source files are linked with object files that are packaged into libraries.

Most implementations allow each step of the compilation process to be executed as a stand-alone procedure. For example, compile a source file but do not invoke the linker; execute the preprocessor only; or, invoke the linker only.

Some older compilers translate C source code into assembly language, then execute an assembler program to translate the assembly language into machine language.

Early C++ compilers (those prior to 1992) translated C++ code into C code and then executed the C compiler.

The loader reads a program (i.e. executable file) into memory. Once this is completed, it becomes a process and the CPU executes it.



Variables, Identifiers and Keywords

A variable is a location in the computer's memory where a value can be stored for use by a program.

An identifier is a name supplied to a variable. An identifier in C is a sequence of letters and digits. The first character must be a letter; the underscore _ counts as a letter. Upper and lower case letters are different (i.e. C is case sensitive). Good programming practice: avoid using leading underscores in identifiers. In addition, current convention is to start variable names with a lower case letter.

A keyword is a predefined identifier that has special meaning to the compiler and it is reserved by the language.

C Keywords

C reserves the following identifiers for use as keywords; they cannot be used otherwise.

   auto, break, case, char, const, continue, default, do, double,
   else, enum, extern, float, for, goto, if, int, long, register,
   return, short, signed, sizeof, static, struct, switch, typedef,
   union, unsigned, void, volatile, while
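
C99 (known as C9X while in draft) added the keyword _Bool; bool is available as a macro through the stdbool.h header, and C23 makes bool itself a keyword.

C++ Keywords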

The following are potential C identifiers that are keywords in C++.

   and, and_eq, asm, bitand, bitor, bool, catch, class, compl, const_cast,
   delete, dynamic_cast, explicit, false, friend, inline, mutable,
   namespace, new, not, not_eq, operator, or, or_eq, private, protected,
   public, reinterpret_cast, static_cast, template, this, throw, true, 
   try, typeid, typename, using, virtual, wchar_t, xor, xor_eq

All of the C keywords are also C++ keywords.

Choosing Identifier Names

Choosing meaningful identifier names helps programs to be "self-documenting."

Generally, 1 or 2 character names are considered cryptic. Some commonly used short names are: c for characters; i, j, k for indexes; n for counters; p or q for pointers; s for strings; and x, y, z for floating-point variables.

Avoid using names that begin with underscores because the implementation may use names like that for its own purposes internally.

Abbreviations for meaningful names, if used, should be used consistently. For example, if nbr is used for number, then always use nbr instead of number.

A name should convey information as to how the variable is used and what type of data it will store. Examples: item_cnt, lines_per_pg, max_input, buffer_size, and so on.

Extremely long names can convey lots of information, but they tend to make code difficult to read and maintain. Names should not exceed 31 characters in length.

Some linkers may treat as few as the first six characters as significant.

Key Points



Primitive Data Types

The primitive data types are built into the language. They are also referred to as basic, atomic, fundamental, base, and so on.

The C and C++ languages support the following primitive data types.

* The bool primitive data type is in C++ only; however, it has been added to the newest versions of C.

The char, short, int, and long primitive data types are integral data types, whereas float, double, and long double are floating-point types.

When you define a variable, memory is allocated.

The amount of memory allocated is implementation-dependent. For example, on system A an int may take 4 bytes, whereas on system B it takes 2 bytes.

The amount of memory allocated dictates the minimum and maximum values that can be stored in variables.



Constants

Constants are values that are set at compile-time and cannot be changed at run-time (i.e. they are immutable).

Each constant value is defined to be a specific type.

By default, an integral number is type int. If its value is too large (or too small) to fit in an int, then it is type long.

By default, a floating-point number is type double.

Character constants are treated as small int values. For example, on some systems, the numeric value of 'A' is 65. The numeric values that are used to represent characters depend on the character set used by the system. [Some popular character sets are ASCII, EBCDIC, and Unicode.]



Introduction to the cout Object

In order to use C++ stream I/O, the iostream header file must be included (older, pre-standard compilers named it iostream.h). In addition, inclusion of iomanip is often needed.

   #include <iostream>
   #include <iomanip>

Upon program startup, the cout object of class ostream is instantiated.

Output to the terminal screen (or the standard output) is performed using the cout object in combination with the left bit-shift operator <<.

   cout << EXPR;
   // in many instances, EXPR needs to be inside ()'s

Multiple values can be output by executing the following.

   cout << EXPR << EXPR << EXPR << ...;

Only one EXPR is allowed after each <<. The << points in the direction of the data flow to cout (the standard output) and is referred to as the insertion operator.

A cout object does not automatically place spaces between values being output, and it does not automatically add a newline at the end of its output.

Both "\n" and '\n' result in a newline being printed to the standard output. In addition, the endl I/O manipulator can be used to print a newline. Example: the following statements all print a newline followed by the value of an EXPR followed by another newline:

   cout << endl << EXPR << endl;
   cout << "\n" << EXPR << "\n";
   cout << '\n' << EXPR << endl;
   cout << endl << EXPR << "\n";
   cout << '\n' << EXPR << '\n';

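C and C++ I/O can be mixed on a per-character basis. In standard C++, the standard streams are synchronized with C stdio by default; if synchronization has been turned off with ios_base::sync_with_stdio(false), the order of mixed output is no longer guaranteed.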



About Escape Sequences

Escape sequences are used for the following reasons.

C and C++
  \a  Alert   \b  Backspace   \f  Formfeed
  \n  Newline   \r  Carriage return   \t  Horizontal Tab
  \v  Vertical Tab   \"  Double quote   \'  Single quote
  \\  Backslash   \?  Question Mark   \0  Null Character

There are two types of numeric escape sequences: octal (for example, '\101') and hexadecimal (for example, '\x41').

Java Escape Sequences

Java will not compile a source file if it contains invalid escape sequences.

  \b  Backspace   \f  Formfeed   \n  Newline
  \r  Carriage return   \t  Horizontal Tab   \"  Double quote
  \'  Single quote   \\  Backslash   \uhhhh  Unicode; 4 hexadecimal digits
  \ooo  C style; 3 octal digits

All of the Java escape sequences -- except for Unicode \u -- can only be used within string and character literals.

   String doubleQuote = "\"";
   char singleQuote = '\'';
   char bestGrade = '\u0041';    // set to 'A'
   char worstGrade = '\106';     // set to 'F'



Arithmetic Operators

The arithmetic operators listed below are binary operators (i.e. they take two operands).

   *     multiply    
   /     divide
   %     modulus (remainder)
   +     addition
   -     subtraction

Important points.


