C# Workshop - Week 1 (Ch. 1 & 2) - Advanced

Started by July 01, 2007 12:15 AM
337 comments, last by paulecoyote 17 years, 2 months ago
Quote: Original post by JWalsh
Shawn,

Thanks for the honest post. If people don't express confusion or a lack of understanding, then everyone just assumes other people are understanding the material, and they become afraid to ask. Myself, Sam, Washu, and the other people who are volunteering their time to help people with C# are happy to answer any C# questions, regardless of how "newbish" the questions may seem. We all have to start somewhere, right? If you are feeling confused about something, please ask questions. That's the only way to learn.

At any rate, let me take some time to cover a few key terms and definitions to help people get started.

Computer Program: A computer program is just a series of instructions that the computer follows in a pre-defined sequence, in order to accomplish something intended by the program author. In other words, a program is just us telling the computer what to do.

Programming Languages (A history): Initially computer programs were developed directly in machine language. This was represented as 1's and 0's in punch-cards and magnetic tape. After a while, programmers realized that rather than having to tell the machine how to do things in its language, they could actually tell the computer what to do in a language closer to their own. But in order to do this, they needed an interpreter of sorts. So they wrote a program (in 1's and 0's) which took instructions in something a little easier to remember, and converted it into machine language. This was how assembly language was born.

Unfortunately, assembly language is still very difficult to work with, as every little minute operation is its own instruction. Also, assembly language tended to be highly dependent upon the underlying architecture of the hardware. In other words, you pretty much had to know how a processor worked internally in order to program it. Once again the brilliant minds of their time got together and realized that they could create an even higher level language, which looked as close to their native language as possible, which could then be compiled into assembly language instructions for the indicated hardware, and finally assembled into machine code. This was how High Level programming languages were born.

The most common "high level" languages of yesterday were C, COBOL, Pascal, Fortran, and Basic. And while each of these languages still experiences some popularity in different circles, they had one dramatic drawback. They were procedural. Although they, C in particular, enjoyed quite a bit of popularity, programmers found it difficult to program even in these high level languages because the data they were working with wasn't necessarily connected in an intelligent way to the operations they were performing on that data. They might have a set of student records, but any changes to those records must be made by external functions which took the records as input, modified them, and then allowed the caller to obtain the modified records. This was a hassle. Also, because data was frequently global, and passed around like a bad STD, debugging applications was incredibly difficult, and maintenance was expensive and time consuming. This was how Object Oriented Programming languages were born.

Only about 10 years after C became popular, C++ was invented. This was a breath of fresh air as for the first time ever (commonly so, anyways), programmers could group their data together with their functions in an intuitive way. They called these groupings "classes". There was encapsulation, which allowed programmers to hide the implementation details from others using their objects, and there was inheritance, which allowed others to assume code they had already written and extend it with their own. All was glorious...except, programs and libraries written in C++ don't play well with programs written in Visual Basic or any other compiled language. This is where Component Oriented Programming was born.

Microsoft realized that the ultimate solution was the creation of shareable components, which could be written in any language that abides by a set of common rules. So they set out to create a number of different technologies which might allow this, finally ending up with Microsoft .NET. And here we are today.

.NET: This is a technology made up of two components - The Common Language Runtime and the .NET Framework Library. The point of .NET is the creation of shareable components, in any .NET compatible language, which can be used to build sophisticated desktop, web-based, and enterprise applications. It is really just the next evolution in the age-old goal of software engineering, which is to develop software without having to constantly re-invent the wheel.

Common Language Runtime (CLR): The Common Language Runtime is the heart of .NET. It is a sort of virtual machine that Just-In-Time compiles Microsoft Intermediate Language (MSIL) into machine-specific code. It also handles garbage collection and system security to make sure unsafe code cannot be executed.

Microsoft Intermediate Language (MSIL or IL): MSIL is code which is partially compiled. It's not compiled all the way down to machine language, but it has been checked for proper semantics, syntax, and has been brought down into a common set of instructions which is independent of any particular programming language. Any .NET compatible language must adhere to the Common Language Specification, which is a detailed document that describes what the output must be, and what functionality must be exposed, by any programming language and compiler which attempts to be .NET compliant. Once this has been accomplished, compiling the language turns it into IL, which is then run by the CLR.

Assembly: Any Intermediate Language file is an assembly. Assemblies typically come in two forms - applications and libraries.

C# Application: A C# application is any application whose instructions were initially written in C#. Incidentally, once the application is compiled, it is turned into Intermediate Language anyways, and it is technically possible to write programs directly in Intermediate Language - though this would be time consuming.

C# Library: A C# library is any library file whose instructions were initially written in C#. As with applications, once the code-base has been compiled it is turned into IL.

Namespaces: Whenever a programmer writes instructions for a computer, they do so using variables which they create. These variables have names, called identifiers, and they have types (see below). To make sure the names of types do not collide with other people's types, they are grouped into namespaces. Think of it this way...let's say my name is Adam. At my university there may have been hundreds of Adams. How would someone know who I am by name from someone else with the same name? Simple, by also using my last name. If you were to say "Adam Smith", then people would immediately know you didn't mean "Adam Johnson". In other words, the last name of the individual helps to qualify WHICH individual is being referred to. Namespaces are the exact same thing. By grouping types into namespaces, we are in essence giving them a last name. And whenever there is confusion about which type to use with a similar name, a programmer can fully qualify it by identifying the namespace.

Ex.

JWalsh.Collections.Vector is different from JWalsh.Math.Vector. Even though both types are called "vector", by grouping them into namespaces the compiler can identify which vector I'm referring to.
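
To make that concrete, here is a minimal sketch of the two Vector types living side by side. The members inside each class (Count, X, Y) and the NamespaceDemo class are made up purely for illustration; only the namespace and type names come from the example above.

```csharp
using System;

namespace JWalsh.Collections
{
    // Hypothetical container type; the field is only a placeholder.
    public class Vector
    {
        public int Count;
    }
}

namespace JWalsh.Math
{
    // Hypothetical 2D math vector that shares the simple name "Vector".
    public class Vector
    {
        public double X;
        public double Y;
    }
}

public static class NamespaceDemo
{
    // Builds one object of each type and reports their fully qualified names.
    public static string Describe()
    {
        var list = new JWalsh.Collections.Vector();
        var point = new JWalsh.Math.Vector();
        return list.GetType().FullName + " / " + point.GetType().FullName;
    }

    public static void Main()
    {
        Console.WriteLine(Describe());
    }
}
```

Writing the full name each time is only needed when both namespaces are in play; with a single using directive the short name Vector would be enough.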

Types: Every variable in C# has a type. A type identifies how much memory a variable needs, and how the data in that memory should be interpreted. For example, in C# if I specify something as type "int", I'm identifying it as an Integer and needing 4 bytes of memory. Because C# is a "Strongly Typed" language, once a variable has been declared a specific type, only other data of the same type can be stored at that address in memory.

Reference Types vs. Value Types: Some types such as integers, floating point values, and structures are value types. When used as local variables they are typically allocated on the stack (the fastest allocatable memory location), so they are quickly created and quickly destroyed. They are meant for "local" variables which will be used only for a short period of time. Whenever you pass a value type into a function, or assign it to another variable, a copy of the data is made. This is usually not a problem as, in general (except for structs), value types are very small.

In contrast, reference types are usually larger and more complex. They are allocated on the heap, which is a bit slower, and intended for long-term use. Whenever you pass a reference variable around, rather than copying the data at the address in memory, it instead just copies the address.

Let's come up with a wacky example. Let's assume for a moment that a pet is a value type, and a human is a reference type. If I were to give my pet to someone, I would actually give them a clone of my pet. Now we both have an identical looking pet. But we know better than to become too attached to our pets don't we...yessss...so when we're done using our pets, we just toss them out, and re-use their food bowls for our next pet. =)

Humans on the other hand cannot be cloned. (it's illegal...seriously). So instead of cloning people, we just go hang out with our other friends while leaving our phone number or address with our other friends. So we're only physically located at one place at a time, but everyone who knows us can still contact us. (yeah, that was all a crappy, convoluted example. Sorry)
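
If the pet/human story is hard to picture, the same idea can be sketched in a few lines of C#. The Pet and Human types and the names used here are invented for this illustration only.

```csharp
using System;

// A value type: assigning it copies the data.
public struct Pet
{
    public string Name;
}

// A reference type: assigning it copies only the reference (the "address").
public class Human
{
    public string Name;
}

public static class CopyDemo
{
    public static string Run()
    {
        Pet myPet = new Pet { Name = "Rex" };
        Pet yourPet = myPet;          // a clone is made
        yourPet.Name = "Max";         // only the clone changes

        Human me = new Human { Name = "Adam" };
        Human sameGuy = me;           // both variables refer to one object
        sameGuy.Name = "Bob";         // the change is visible through "me" too

        return myPet.Name + " " + me.Name;
    }

    public static void Main()
    {
        Console.WriteLine(Run());     // prints "Rex Bob"
    }
}
```

The pet keeps its original name because the copy was changed, while the human's name changes because there was only ever one human.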

Statements: Statements are any instruction you give to the computer. These can be memory allocation, branching, looping, or even mathematical expressions. For easier understanding, any line which ends in a ';' or has a set of '{ }' is probably a statement.

Expressions: An expression is just a statement which results in a value. ex. int myInteger = 2 + 3; that is an expression whose value is 5, and is assigned to myInteger.

Structs: Structs are user-defined value types. They are useful for things which are less capable of being represented by a single value. For example, if I asked you how much money you had in your pocket, you COULD just tell me the numerical value. ie $1.38. But if I asked you which coins you had in your pocket, you'd now be unable to express it as a single value. Instead, you'd need to provide me with a set of related values. The above example might be 5 quarters, 1 dime, and 3 pennies. If I wanted to represent that as a structure I could say:
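float Money = 1.38f;   // the 'f' suffix is required for a float literal

struct Coins
{
    // fields must be public to be set from outside the struct
    public int CountOfQuarters;
    public int CountOfDimes;
    public int CountOfNickels;
    public int CountOfPennies;
}

Coins myCoins = new Coins();
myCoins.CountOfQuarters = 5;
myCoins.CountOfDimes = 1;
myCoins.CountOfPennies = 3;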

Classes & Objects: Classes are similar to structs, in that they serve a similar purpose. They are designed to group data together. However, classes are much more sophisticated. Specifically, they are reference types rather than value types and support class inheritance with polymorphism. Whenever you create an instance of a class, it is called an object.

Members: All items within a struct or class are its members. These include variables, functions, properties, and events.

Methods: This is a fancy name for a class's functions.

Fields: This is a fancy name for a class's variables.

Properties: These are C#'s setters/getters. They are a new construct which looks like a Field, but behaves like a Method.

Events: These are members of a class which make notifications possible. How/why these are used will become more obvious later.

Operators: These are the symbols seen in C# source such as +, -, /, (), etc...and they perform operations upon their operands.

Base Classes, Inheritance and Derived Classes: Sometimes you, or someone else, has written code which you'd like to take advantage of, but don't want to have to go through the trouble of duplicating. In this case, you can derive a new class from an existing class. The class which already existed is the "Base Class", while the new one you're creating is the "Derived Class". The process of basing a new class upon an existing class is called "Inheritance", and simply means that all public and protected members (except for the constructor) of the base class are now part of the derived class.
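
Here is a rough sketch that puts the last few definitions into one place: a field, a property, a method, an event, and a derived class. Character, Wizard, and every member name below are hypothetical and chosen only to label each kind of member.

```csharp
using System;

public class Character
{
    private int health = 100;                  // field

    public int Health                          // property: looks like a field, runs code
    {
        get { return health; }
        set { health = value < 0 ? 0 : value; }
    }

    public event EventHandler Died;            // event: lets others be notified

    public void TakeDamage(int amount)         // method
    {
        Health = Health - amount;
        if (Health == 0 && Died != null)
            Died(this, EventArgs.Empty);
    }
}

// Wizard is the derived class, Character is the base class.
public class Wizard : Character
{
    public int Mana = 50;
}

public static class MembersDemo
{
    public static string Run()
    {
        Wizard w = new Wizard();
        bool died = false;
        w.Died += delegate { died = true; };   // subscribe to the inherited event

        w.TakeDamage(30);                      // inherited method
        int afterFirstHit = w.Health;          // inherited property
        w.TakeDamage(500);                     // the setter clamps health at 0

        return afterFirstHit + " " + w.Health + " " + died;
    }

    public static void Main()
    {
        Console.WriteLine(Run());              // prints "70 0 True"
    }
}
```

Wizard never declares Health, TakeDamage, or Died, yet all three are usable on it because they are inherited from Character.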

All of the above, except for my history lesson, is contained and presented in a far more elegant fashion within the text. If you're having trouble with a specific subject, feel free to ask questions, or request aid on identifying where in the text a subject is covered in the most enlightening fashion.

Hope this helped!


Wow I always thought of "members" as data members of a class.... and "methods" were functions of a class... It seems redundant to group Methods and Fields into a bigger classification of Members... Maybe it's just because I've thought that for so long.
-durfy
Quote: by Jwalsh
Members: All items within a struct or class are its members. These include variables, functions, properties, and events.

Methods: This is a fancy name for a class's functions.

Fields: This is a fancy name for a class's variables.

Properties: These are C#'s setters/getters. They are a new construct which looks like a Field, but behaves like a Method.

Events: These are members of a class which make notifications possible. How/why these are used will become more obvious later.

Quote: by DvDmanDT
Here Character is a class, that is a type of variable (a type of object). It has 4 members, x, y, Name and GotoPosition(). It has 3 fields, Name, x and y. It has one method, GotoPosition().

A function is a piece of runnable code (you can't run a class definition, since it doesn't "do" anything, but you can run a function since it does something).

A method is a function in a class.

Usually when someone says function, they mean a function not tied to a class. In C#/.NET, there are no functions in that sense, there are only methods.


Quote: by jpetrie
Not interchangeably, no. In the context of a class, a field is a member, but not all members are fields.

"Member" is just a more generic term referring to something that is a part of something else - common parlance includes terms like "member variables" ("fields"), and "member functions" ("methods"). Things that are part of the class are members of the class.



Those three items cleared up quite a bit. Thanks people.

Shawn

Quote:
Statements: Statements are any instruction you give to the computer. These can be memory allocation, branching, looping, or even mathematical expressions. For easier understanding, any line which ends in a ';' or has a set of '{ }' is probably a statement.

Expressions: An expression is just a statement which results in a value. ex. int myInteger = 2 + 3; that is an expression whose value is 5, and is assigned to myInteger.


Umm.. Isn't that a statement?

In my world, an expression is more "2 + 3" or even "myInteger = 2 + 3", but not "int myInteger = 2 + 3;" which is a statement. A statement is often built using expressions, so a statement often contains one or more expressions, but you can't use a statement as an expression. Is my definition wrong?

You can use a statement as an expression can't you?
if (MyObject =& Engine::getObjectInstance())?

or

if (int i = 3+3) -- always true but evaluates nonetheless
-durfy
No, not valid in C# I think. You can assign to a variable in an expression, but you can't declare a variable in an expression. I think. Also, inside ifs, whiles and so on, everything inside the ( and the ) must evaluate to a bool.

int i;
if((i = 10) > 4) // valid


if((int i = 10) > 4) // not valid I think, but I haven't tried it


if(10) // not valid in C# (can't implicitly cast int to bool or something like that), but valid in some other languages
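
For anyone who wants to try it, here is a small sketch of the same three cases as a compilable C# file. The ConditionDemo class name is made up; the two rejected forms are left as comments because, as far as I know, the C# compiler refuses both.

```csharp
using System;

public static class ConditionDemo
{
    // An assignment is an expression, so its value can be compared
    // inside the condition as long as the whole condition is a bool.
    public static bool AssignInsideCondition()
    {
        int i;
        if ((i = 10) > 4)
            return true;
        return false;
    }

    // These do not compile in C#:
    //   if ((int i = 10) > 4) { }   // a declaration is a statement, not an expression
    //   if (10) { }                 // int cannot be implicitly converted to bool

    public static void Main()
    {
        Console.WriteLine(AssignInsideCondition());   // prints "True"
    }
}
```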
You are probably right i'm really new to c#
-durfy
Quote: Original post by JWalsh
Shawn,

*snipped*

Hope this helped!


Wow that helped me out so much, you have no idea. XD I was so lost when reading the C# spec, but after that it seems to all click together now. Thanks!
I'm pretty new to C# myself, but I do have some experience with C++, and I assume expressions and statements are the same thing in both languages.

NOTE, the following text refers to C/C++, and not to C#, even though parts of it _might_ be true for C# as well.

I know they changed the syntax of the for-loop from three expressions to one statement and two expressions to allow

for(int i = 0; i < 10; i++)

instead of

int i;
for(i = 0; i < 10; i++)


that is, they changed

for(expr ; expr ; expr)

to

for(statement expr ; expr)
Quote: Original post by Menace2Society
What do compile-time types and runtime types even mean?

This is kinda hard to explain in a few lines, so here goes...

Programming consists of manipulating values: you do stuff with things. Example:
5 - 3           // subtracting numbers
func()          // invoking a function named func
"abc" + "def"   // concatenating strings


But what happens if I'd write this
5();            // invoking a number
func + func     // adding functions
"abc" - "def"   // subtracting strings

These don't make any sense. Operations can only be applied on objects of certain types.

This can be checked at runtime: while your program is running, it could check if the two objects it received have the correct type:
// Implementation of operator +:
function +(x, y)
{
    if ( x is int && y is int )
       ... // add numbers together
    else if ( x is string && y is string )
       ... // concatenate strings
    else
       error!
}

Languages working this way are called dynamically typed. Python, Scheme, Common Lisp, Oz are examples. Problems with this approach: it's slow, and errors are only detected at runtime.
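
The pseudocode above can be imitated in C# by throwing the static type information away on purpose and passing everything around as object. This is only a sketch of the idea, not how any of those languages actually implement +; the DynamicAdd class and its Add method are hypothetical names.

```csharp
using System;

public static class DynamicAdd
{
    // Both arguments are typed as object, so the compiler knows nothing
    // useful about them and every check has to happen while the program runs.
    public static object Add(object x, object y)
    {
        if (x is int && y is int)
            return (int)x + (int)y;              // add numbers together
        if (x is string && y is string)
            return (string)x + (string)y;        // concatenate strings
        throw new InvalidOperationException("operator + is not defined for these types");
    }

    public static void Main()
    {
        Console.WriteLine(Add(5, 3));            // prints "8"
        Console.WriteLine(Add("abc", "def"));    // prints "abcdef"
        // Add(5, "def") compiles fine and only fails once it is executed.
    }
}
```

The last comment is the whole point: the mistake is accepted by the compiler and only shows up when that line is actually reached.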

We can do better by using a static type system. This consists of adding type information to your code, making it possible for the compiler to perform those checks in advance. This would mean all those runtime checks become unnecessary plus you can be certain that that specific kind of errors (i.e. applying operations on wrong types) won't occur, ever. Languages such as C#, Java, C++, Eiffel, ... have static type systems, and are called statically typed languages.

Now, you can go very far using type theory. Systems like Coq can even prove your code to be completely bug-free (not just detecting type errors), but these systems are rather hard to use. So, concessions have to be made. One of these concessions is using the concept of compile-time types and runtime types.

First, consider the following type hierarchy:
Something
  A number
    An integer
      A positive integer
        A prime number
          The number 5

Something offers no information whatsoever. It could be anything. A number is more specific already, and we know we can use + and - on numbers. Even more specific is positive integer, which allows us to take the square root of the value. We can go even further with a prime number. This is more informative than any of the previous types. The last type, the number 5 is the most specific of them all (yes, this is also a type, but there's only one value with that type, i.e. 5).
Notice that each type is a subtype of the one before it: 5 is a prime number which is a positive integer which is an integer which is a number which is something.

Having precise types is a good thing, as it would mean the compiler can perform more thorough analyses. Let's see now how operations would be defined... we get into trouble here.

Let's try to work with the ultraspecific types the number 1, the number 2, etc. This would mean we have full information, and the compiler could evaluate your entire program for you. This would be nice if programs didn't have that nasty habit of depending on external data: user input, files, ... Concrete example:
number_5 x = read_5_from_console(); // user is only allowed to input 5

This means we need to go back a level of specificity.

If we were to work with types like prime number: using multiplication would be easy:
prime    * prime    = nonprime
prime    * nonprime = nonprime
nonprime * prime    = nonprime
nonprime * nonprime = nonprime

But what about addition? 2+3=5, so prime+prime=prime. But 13+2=15, meaning prime+prime=nonprime. Oops. Prime does not contain sufficient information to be able to provide a valid +, so we must discard it.

positive integer, now that looks promising! But if we look at the subtraction operator... 5-3=2, so posint-posint=posint, but 3-5=-2, meaning posint-posint=negint. Damn.

It seems we have to fall back to integer... but 8/2=4 : int/int=int, and 8/3 = 2.6666 : int/int=float. This is getting frustrating.

Number seems to be our last hope. But if we divide by 0, we don't get a number as a result... We're screwed.

Is something really the best we can do? Luckily, no. We can combine the different levels of specificity. So, if we try our examples again:
prime + prime = positive integer     // we give up one level of specificity, but at least it's always correct
posint - posint = int
int / int = something


This brings us to compile-time type and runtime type. Finally. Let's take our positive integer-subtraction example again: the only thing you can be sure of beforehand is that posint-posint=int.

If we have the code 5-3, it will of course evaluate to 2. But 2 is actually a positive integer. If we were to create it in C#, we'd use new PositiveInteger(2); or something like that. So, in the actual case of subtracting the posint 3 from the posint 5, the result will be another posint. But we only find out after the computation, at runtime.

So, intuitively, we can say that the "runtime type" is the "real type" of the object, and the "compile-time type" is the one the compiler has to use for error checking. The compile-time type is some sort of "safe guess": if it could be either a posint or a negint, let's settle for an int. This leads to an important rule: the compile-time type must always be a supertype of the runtime type, meaning the type must be valid for all cases, whatever the actual values used. E.g. if subtraction could return a string for some strange reason, then the possible outcomes for the runtime type are posint, negint and string. The compile-time type would then have to be something. Note: a type is considered to be a supertype of itself, so the compile-time type can be the same as the runtime type.

So, if we were to write all this in C#, it would look like (a lot of details omitted for simplicity's sake)
class Something { ... }
class Number : Something { ... }
class Integer : Number { }
class PositiveInteger : Integer
{
    // C# requires user-defined operators to be declared public static
    public static Integer operator -(PositiveInteger x, PositiveInteger y) { ... }
}
class NegativeInteger : Integer { ... }


So, as an attempt of summarizing it all: compilers try to prove correctness, but need to make some concessions by throwing away valuable information. This leads to the separation of the runtime-type (the true type of an object) and the conservative approximation (compile-time type).

Another example:
Integer x = new PositiveInteger(5);

The runtime-type is positive integer. When you use new to instantiate a class C, the runtime type is always C.
The compile-time type is integer. When working with x, the compiler will only know it is an integer which might be either positive or negative. We know better however, but that piece of information has been thrown away. In order to keep this information we can of course write
PositiveInteger x = new PositiveInteger(5);

However, it is often unavoidable that information will "seep away" by applying operations on x, such as subtracting another posint from x:
PositiveInteger x = new PositiveInteger(5);
PositiveInteger y = new PositiveInteger(3);
Integer z = x - y; // we fall back on Integer, which is unavoidable

However, casts let us tell the compiler "trust us, we're sure about this type":
PositiveInteger x = new PositiveInteger(5);
PositiveInteger y = new PositiveInteger(3);
PositiveInteger z = (PositiveInteger) (x - y);

For safety though, casts are always checked at runtime. So, if x-y were to return a NegativeInteger (e.g. x=3, y=5), the cast would be invalid as negints are not posints, and an exception will be thrown. Remember that the compile-time type (here PositiveInteger) must be a supertype of the runtime type, which happens to be NegativeInteger in our case.
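
For those who want to see this run, here is one way the omitted details could be filled in. It is a simplified sketch under my own assumptions: the Value field, the constructors, the body of operator - and the TypeDemo helper are all invented, Something and Number are left out, and nothing stops you from creating a PositiveInteger with a negative value.

```csharp
using System;

public class Integer
{
    public readonly int Value;
    public Integer(int value) { Value = value; }
}

public class PositiveInteger : Integer
{
    public PositiveInteger(int value) : base(value) { }

    // Compile-time result type: Integer. The runtime type depends on the values.
    public static Integer operator -(PositiveInteger x, PositiveInteger y)
    {
        int difference = x.Value - y.Value;
        if (difference > 0) return new PositiveInteger(difference);
        if (difference < 0) return new NegativeInteger(difference);
        return new Integer(difference);
    }
}

public class NegativeInteger : Integer
{
    public NegativeInteger(int value) : base(value) { }
}

public static class TypeDemo
{
    // Reports the runtime type of a - b; the variable itself is typed Integer.
    public static string RuntimeTypeOf(int a, int b)
    {
        Integer z = new PositiveInteger(a) - new PositiveInteger(b);
        return z.GetType().Name;
    }

    // The cast compiles either way, but it is checked when it executes.
    public static bool CastSucceeds(int a, int b)
    {
        try
        {
            PositiveInteger z = (PositiveInteger)(new PositiveInteger(a) - new PositiveInteger(b));
            return z != null;
        }
        catch (InvalidCastException)
        {
            return false;
        }
    }

    public static void Main()
    {
        Console.WriteLine(RuntimeTypeOf(5, 3));   // prints "PositiveInteger"
        Console.WriteLine(RuntimeTypeOf(3, 5));   // prints "NegativeInteger"
        Console.WriteLine(CastSucceeds(5, 3));    // prints "True"
        Console.WriteLine(CastSucceeds(3, 5));    // prints "False"
    }
}
```

In both RuntimeTypeOf calls the compiler only ever sees an Integer; the more specific type exists only once the subtraction has actually been performed.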

I hope this kind of explains the difference and the need between compile-time and runtime-types and that it didn't just confuse you more.

A lot of quirky details have been left out, such as why int/int does give an int (C#, C++, Java, ...) or at least a number, or that uint-uint does give a uint (uint is an unsigned integer, i.e. a non-negative integer). But that's possible because the languages cheat.

[Edited by - SamLowry on July 3, 2007 2:25:47 PM]
Quote: Statements: Statements are any instruction you give to the computer. These can be memory allocation, branching, looping, or even mathematical expressions. For easier understanding, any line which ends in a ';' or has a set of '{ }' is probably a statement.

Expressions: An expression is just a statement which results in a value. ex. int myInteger = 2 + 3; that is an expression whose value is 5, and is assigned to myInteger.


These two lines from my post before seem to have caused some confusion/discussion. Allow me to clarify.

As indicated above, a statement is any instruction given to the computer. This means that anything, and everything you enter into a program's source code is either a statement itself, or part of a larger, compound statement.

Now, here's where the confusion came in. An expression is any piece of code which returns a value. Period.

examples:
    a + b
    myObject.GetValue()
    x++
    new MyObject()
    myObject as Object
    ...


All of the above are expressions in C#, as each returns a value. Now, in C#, you cannot have an expression which is not part of a statement because...wait for it...everything must be part of a statement. So although writing "a + b" satisfactorily demonstrates an expression, it's illegal in C#. That, by itself, will cause a compile error. To make it valid, it must be included in a statement.

Interestingly enough, there is a type of statement called an "Expression Statement." In my example above I placed the expression inside a complete statement to demonstrate an expression in the context of something legal in C#. Let's look at it again...

Quote:
Expressions: An expression is just a statement which results in a value. ex. int myInteger = 2 + 3; that is an expression whose value is 5, and is assigned to myInteger.


"An expression is just a statement which returns a value"...check
"That is an expression whose value is 5"...check (2+3 is the expression)
..and is assigned to myInteger...check (strictly speaking, the leading "int" makes the whole line a declaration statement with an initializer; without it, "myInteger = 2 + 3;" is an expression statement, because the assignment is itself an expression)

So, summary?

All instructions are statements. Statements come in different flavors including variable declaration, if-statements, switch-statements, while statements, return statements, and even expression statements.

Expressions are any piece of code which returns a value, and must be part of a larger statement. When a statement consists of nothing but an expression evaluated for its effect, such as an assignment, a method call, an increment, or an object creation, it is an Expression Statement.
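
As a quick sketch of that summary, here is one method with each line labelled by the kind of statement it is. The StatementDemo class and the variable names are made up for the illustration.

```csharp
using System;

public static class StatementDemo
{
    public static int Run()
    {
        int a = 2, b = 3;        // declaration statement; "2" and "3" are expressions
        int sum;                 // declaration statement with no expression at all
        sum = a + b;             // expression statement: an assignment
        sum++;                   // expression statement: an increment
        Math.Max(a, b);          // expression statement: a call whose value is discarded
        // a + b;                // not allowed: a bare "a + b" cannot stand as a statement

        if (sum > 5)             // if statement; "sum > 5" is an expression
        {
            return sum;          // return statement; "sum" is an expression
        }
        return 0;
    }

    public static void Main()
    {
        Console.WriteLine(Run());   // prints "6"
    }
}
```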

Cheers!
Jeromy Walsh
Sr. Tools & Engine Programmer | Software Engineer
Microsoft Windows Phone Team
Chronicles of Elyria (An In-development MMORPG)
GameDevelopedia.com - Blog & Tutorials
GDNet Mentoring: XNA Workshop | C# Workshop | C++ Workshop
"The question is not how far, the question is do you possess the constitution, the depth of faith, to go as far as is needed?" - Il Duche, Boondock Saints
