Hello and welcome to the second video in this series made for 24 hour answers, where we
look at becoming a better programmer. This time we're going to be examining basic data
types and how you integrate these into your programs, some key differences between seemingly
similar types, modifiers you can use with these types, and how you might manipulate
these data types as variables. You will find most of these data types in one form or another
in any programming language, but we will be focusing on C and C++.
Rather than start with program code, I'm currently displaying a list of the data types and a
brief description. We'll go over these now before looking at using them in the wild.
Integers, type int, are single whole numbers. By default they can be either positive or
negative. On most modern systems integers take up 4 bytes of memory, giving them a range from roughly -2.1 billion to +2.1
billion. Generally speaking you will use these to represent numbers in systems where you
need the flexibility to represent more than a limited range of data - for example, the
number of entries in a database, counting large repetitions of loop statements, and
any other kind of ranking, score or number you may encounter. They are probably the most
common numeric type.
If you find yourself requiring even larger numbers, you can employ what's known as a
'long long'. The naming of this type comes from the fact that, on many systems, a long is a data type
the same size as an integer; as such, a long long is just a larger integer and has the same constraint:
only whole numbers. This type uses 64 bits or 8 bytes and ranges from -2 to the power of 63 up to
2 to the power of 63 minus 1, roughly plus or minus 9 quintillion. It will almost certainly be more than big enough for any purpose you'll
have in common programming.
In the other direction, you can economise with smaller data types and use a char (short
for character). These types are designed to represent single letters, but can be used
as numbers just as easily - to the system and compiler, the letter 'a' and the number
97 are in fact the same (if you're interested, the number 65 corresponds to an uppercase
'A'). Using 8 bits or a single byte, these values typically range from -128 to 127. Later on we'll
see how to use arrays of chars to represent character strings in C, and how to do the
same in C++.
These are the main integral types. Alongside these are floating point numbers, that is,
numbers that contain a decimal point. These are handled by two types, the float and the
double, which refer to single and double precision floating point numbers. The float reserves
23 bits to represent the fractional part of the number. This gives you an effective precision
of about 7 significant decimal digits. In general you'll find this is useful for storing information like
currency when completing programming assignments, but due to rounding errors this would be very
bad practice in industry, particularly in the financial sector - though I can assure
you it still happens!
The double type offers you about 15 to 16 significant decimal digits and a larger range. If you're performing or
calculating fractional arithmetic, this is the safest type for you to use, but bear in
mind the limitations of this numeric type - it is still prone to errors in rounding.
If you make it into industry as a programmer you may find sectors where even more precise
mathematics are required - like programming aviation or safety equipment - and in this
spirit there are even larger or more precise data types available. This of course is beyond
the scope of this video.
The next point for consideration is the modifying keywords you can use with these data types.
I'm going to focus on one in particular, unsigned. The others, short and long, are less commonly
used, and you'll find often that certain type/modifier combinations are identical to others. The
unsigned keyword, however, is useful and distinctive. It converts a type from allowing a negative-to-positive
range (like -128 to 127) to permitting only zero and positive numbers. Thus an unsigned char becomes
0 to 255; an unsigned integer becomes 0 to about 4.3 billion instead of topping out at about 2.1 billion; an unsigned long long
becomes 0 to about 18 quintillion, and so on. This is useful for squeezing larger values out
of the same amount of memory. Whilst it's very easy given the power of modern computers
to simply throw around the largest data type available at any point, it is good practice
to bear in mind there is always an optimally efficient way of writing a program, so as
to use the least memory or the fewest operations. Making the correct judgements will mean you
can write more robust, terse code that achieves the same goals more efficiently than others.
Now we've seen these types, let's look at how to use them in actual programs. The program
on screen here is very simple; it creates each of the data types we've mentioned, sets
them to a value, then prints their value to the screen. This also expands upon the printf
function to show you how you can display these values. The code here has been written in
the main function which you should remember from the last tutorial.
This code is broken into blocks and makes use of comments, which we haven't covered
yet. These double forward-slashes indicate text that the compiler will not use; it is
designed to be read by people and not machines. Getting into the practice of documenting and
commenting your code is a great head start if you wish to become a programmer in industry
and such 'good practice' may help distinguish you as an interview candidate. These lines
will be ignored when the program is compiled and executed.
The first code block demonstrates simply printing an integer. You will recognise the printf
function, but it has some changes from last time. Now instead of just printing a message,
we are printing a token - the percent symbol followed by a letter - and after this message
is a comma with our variable. This means the function now takes two arguments - the string
format, and the variables of that format. In fact, printf can take any number of tokens
like this with any number of variables, one for each token.
Speaking of variables, between our two printf lines is our first variable statement. This
is actually broken into two parts. Firstly, we have the variable declaration - "int myInteger".
This tells the compiler to create an integer and name it myInteger. You can use any name
for your variable consisting of letters, numbers and underscores, though you cannot start with
a number. The second part of the line is the variable initialisation, where we set myInteger
to store the number 10,000. You must always remember to initialise variables you have
declared; after forgetting to end lines in semi-colons this is probably the second most
common mistake you will encounter. It is, however, much more dangerous; before initialisation,
the variable could contain literally anything - it's just a chunk of memory on your system.
To make matters worse, use of an uninitialised variable will not cause a compilation error
- in fact, it may not even cause outwardly noticeable problems at all, so take care.
That aside, these lines declare an integer, set the value to 10,000, then print it.
The next code block, labeled 'char', demonstrates the interchangeable nature of char types.
The myCharA variable is initialised as the letter A. Note the use of single quotation
marks; this was touched upon in the last lesson. If you want to use characters, set them with
single quotation marks. Character strings, on the other hand, require double quotation
marks. For the record, escape characters like our backslash-n are classed as single characters
when they appear on their own. myCharB is initialised as the number 65, which is the
numeric value of A. The next four printf lines require some explanation; the first two print
both myCharA and myCharB as integer types - hence the use of %i. The two lines after
print the chars as character types, which is why %c is used instead. When we run the
program, you will see the number 65 printed twice, then the uppercase letter A printed
twice.
The next block, demonstrating long long, is simpler; it simply demonstrates that we can
use very large numbers with long long types. The token to print this variable type is %lld.
The double/float variable block is interesting. Both myFloat and myDouble are initialised
to the same value - .50000000002. Because the float has only about 7 significant digits of precision,
when you run this program it will display only .5. The double, on the other hand, will
display the full number, including the 2. The %.12f token indicates we want floating
point numbers to 12 digit precision, enough to demonstrate this concept.
Finally I have shown the unsigned modifier at work. Firstly we have an unsigned char
set to 250; this value is larger than a regular, signed char could hold. Secondly, we have
attempted to set an unsigned char to a negative value, -1, to see what happens. When you run
this program, you'll see that the -1 is printed instead as 255. Play around with different
negative numbers and see how they get converted. This also gives you an indication as to how
the computer reuses the same amount of memory to reach positive values twice as large: the bit patterns that would have meant negative numbers are read as the upper half of the range.
Note for unsigned variables, we print with %u, and that all these tokens are case sensitive.
And that's about it; I've shown the program output here. Try to replicate this program,
change the values, add or remove the unsigned modifiers, and if you're feeling ambitious
try putting multiple tokens and several variables into a single printf statement to see what
you come up with. As always, feel free to post any questions below.