Ballad of a Duck

From Compsci.ca Wiki

Revision as of 06:37, 18 October 2011

Preface

Why do we bother to write computer programs at all? Surely there must be a reason to undertake such a task. In fact, I believe there to be several.

First and weakest of these, I think, is our desire to simplify a task. Programming can greatly alleviate the tedium of simple, repetitive processes. I am confident this is the reason about 99% of computer programs have been written, but it is not a particularly strong reason why people have done so, and still less why they continue to do so.

Existing tools do such a good (or at least competent) job of this that there is little continuing reason to program such tools from the ground up. In fact, many programming tasks amount to just slightly modifying these existing solutions.

Learning a programming language and a computer system well enough to deliver a competent automation tool, and then actually building that tool, is neither simple nor easy. It requires years of study, and any serious endeavor likely entails years of fine-tuning. A great deal of tedium can be justified in place of this expenditure of effort.

No, we cannot reasonably say that people go into computer programming just to make their lives easier. That job has been accomplished, and even had it not, the personal payoff is very rarely worth it.

However, we are creatures of pride, and we frequently see opportunities to improve on or best the work of others, especially when presented with a solution that is at best competent. Now, of course we may never succeed at doing so, but our belief that we can keeps us trying.

One need only look at the proliferation of "standards" to see this in action. How many instant messaging protocols does the world need, really? And yet, people keep striving to better the existing stalwarts. This is a good thing: it keeps the world of computers fresh and interesting and (generally speaking) continually improving.

Pride is a good reason to program, but also a dangerous one. A prideful programmer can become convinced that his or her creations are superior regardless of their objective merit.

Of course, it's possible to offset this hazard by filling an unmet need. This is an increasingly rare, but powerful motivation. Imagine the drive to create a complete cross-platform office suite for open source computing platforms. Those programmers had the pride to say they could do it, and nothing to compare against to demotivate them.

But the most important reason we write computer programs is the simplest, and one that cannot easily be discouraged by the body of existing work, or even a lack of overwhelming pride in ourselves. Quite simply, we write computer programs to find out if we can. We write more to find out how far we can go.

This is why I began programming years ago. I'm not sure I've yet found a limit. What will yours be? I hope you'll never find out, but if you want to push yourself, keep reading.

My Approach

I am a self-taught computer programmer. My knowledge has been pieced together from bits of code, often sarcastic internet rants about best practices, and my willingness to ask questions.

I will try to guide you, the readers, through some of the basics, but I expect you to go out and explore on your own, and to answer the basic questions yourselves. Don't worry, I'll help you with the details.

If this approach bothers you, then stop reading here. I don't believe in spoon-feeding you information. Anytime I've done that I've never been able to tell who was learning and who was just regurgitating code. If you see yourself in that latter description, try to change. If you can't, there are myriad options out there that will give you the easy answers you seek.

Language

Lots of computer programming books teach you how to write code in Java or C++ or C# because those are useful languages and all basically the same. They are eminently useful for streamlining repetitive tasks and getting work done, which we know is what computer programs are written to do. Why shouldn't they teach these languages to you?

They shouldn't because that's not why we learn to write computer programs. Sure, C++ can be a huge challenge, but not the way it's taught or used by most, and many of its challenges lie in minutiae. Java and C# are comparatively boring languages. Most of their mastery lies in knowing which library to use and remembering long names.

All of those languages deal with a very simple concept: we have a piece of data X and we tell the computer to do Y to it, then we tell it to do Z to it in a linear fashion. The piece of data keeps getting changed until it's what we need. This is what we typically refer to as "imperative" programming, and it encompasses the majority of widely used programming languages.

But we often hear programmers talk about "functional" programming, and they call it difficult and present it as generally unfathomable. We still transform data in a functional programming language, but it encourages us to find different ways to accomplish that goal, and the conventional wisdom is that this is too difficult.

It's for precisely this reason that a functional programming language will be my choice. I have made a choice that I firmly believe represents the best balance of characteristics: it has no confusing fragmented implementations or libraries, and yet supports solid tools. It also boasts the advantage of a more direct syntax for expressing a number of concepts that the aforementioned mainstream programming languages have to come at in rather oblique ways.

I want you to continue reading because you like that people have said functional programming is too hard; you like proving them wrong and proving to yourself that you can do it.

Data

I have said that programming at its heart is transforming one piece of data into another. It behooves us, then, to start out by taking some time to look at what data really is.

Data are all of the tiny points of information that make up life. The meat of the data world is numbers. We deal with numbers constantly, and so too must programmers.

We deal most commonly with whole numbers, or what a programmer would call integers. Integers are easy and direct. We can relate them to solid everyday objects, and as in written communication, integers are very easy to symbolize in computer programs.

Less easy are non-whole numbers, or as programmers would say, floating-point numbers. These are fractions of things: 3.4 km to the library; half a pizza left in the fridge. Humans don't usually have much difficulty working with these kinds of numbers, but computers, with their binary brains, do. In fact, while numbers like these can be represented accurately, it takes a lot of computational power to do so. To address this, computers store an approximation of these numbers, and that approximation becomes less accurate as the numbers grow larger.
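As a rough illustration of that approximation, here is a minimal sketch. It assumes the language in use is OCaml, which the later reference to Pervasives suggests but which the text never states outright. The names sum_is_exact, big and increment_is_lost are invented for this example.

```ocaml
(* 0.1 and 0.2 have no exact binary representation, so their sum is
   only close to 0.3, not equal to it. *)
let sum_is_exact = (0.1 +. 0.2 = 0.3)

(* At large magnitudes the gap between neighbouring floating-point
   values is bigger than 1, so adding 1.0 can change nothing at all. *)
let big = 1e16
let increment_is_lost = (big +. 1.0 = big)

let () =
  Printf.printf "0.1 + 0.2 is exactly 0.3: %b\n" sum_is_exact;
  Printf.printf "1e16 + 1.0 is still 1e16: %b\n" increment_is_lost
```

Note that floating-point arithmetic uses its own operators (+. rather than +), so the two kinds of number cannot be mixed by accident.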

This will likely not be a huge obstacle at this level, but it does warrant consideration. Money is a perfect example. If you're keeping track of money, there's a strong temptation to track dollars, and use floating-point numbers to account for pennies. Knowing that you can instead count using integers to represent pennies, you should do so to avoid any inaccuracy as calculations are made on those values.
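The difference between the two approaches to money can be sketched as follows, again assuming OCaml. Every name here (add_floats, add_ints, dollars, pennies, format_pennies) is hypothetical and exists only for this example.

```ocaml
(* Add the same amount to a running total a given number of times. *)
let rec add_floats total amount times =
  if times = 0 then total
  else add_floats (total +. amount) amount (times - 1)

let rec add_ints total amount times =
  if times = 0 then total
  else add_ints (total + amount) amount (times - 1)

(* Ten dimes tracked as fractional dollars: small errors pile up and
   the total does not come out as exactly 1.0. *)
let dollars = add_floats 0.0 0.10 10

(* The same ten dimes tracked as whole pennies: exactly 100. *)
let pennies = add_ints 0 10 10

(* Convert to dollars and cents only for display.
   This sketch assumes the amount is not negative. *)
let format_pennies p = Printf.sprintf "$%d.%02d" (p / 100) (p mod 100)

let () =
  Printf.printf "float total equals 1.0: %b\n" (dollars = 1.0);
  Printf.printf "integer total: %s\n" (format_pennies pennies)
```

The integer version stays exact because whole pennies never need rounding; the decimal point only appears at the moment the value is shown to a person.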

Aside from numbers, we can represent text with two different types of data. The simpler is a character. For the sake of simple Latin-based characters, characters can be thought of as simply integers between 0 and 255, where each number represents a unique character. A string is a sequence of characters. Strings are generally far more useful, though working with them takes a little more getting used to.
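A small sketch of characters and strings, assuming OCaml and its standard Char and String modules. The value names are invented for the example.

```ocaml
(* A character corresponds to an integer code, and back again. *)
let code_of_a = Char.code 'A'       (* 65 *)
let char_of_97 = Char.chr 97        (* 'a' *)

(* A string is a sequence of characters. The ^ operator joins strings. *)
let title = "Ballad" ^ " of a " ^ "Duck"

(* Individual characters are reached by position, counting from 0. *)
let first_letter = title.[0]        (* 'B' *)
let letter_count = String.length title

let () =
  Printf.printf "%d %c %c %d\n" code_of_a char_of_97 first_letter letter_count
```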

Notes on Code Organization

As we explore increasingly complex programs, it will be important for us to find ways to organize our code. Two fundamental units of organization exist. These will be explored later in greater detail, but some theory is called for first.

Functions

Functions give a name to a transformation of one piece of information into another. Rather than simply writing the same code repeatedly throughout our program, we can name a given calculation. This avoids tedious retyping or copying and pasting, but more importantly, lets us tell the reader of the code what we're calculating, rather than how we're doing it. As well, improvements to a given set of code can be made in one place, and the benefits apply everywhere that function is used.
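To make the idea of naming a calculation concrete, here is a minimal sketch, assuming OCaml. The function pennies_of and the surrounding values are hypothetical, chosen to echo the money discussion earlier.

```ocaml
(* Written inline, the reader sees how each value is computed but has
   to guess what it means. *)
let lunch_inline = 12 * 100 + 50
let coffee_inline = 3 * 100 + 25

(* Naming the calculation says what is being computed. *)
let pennies_of dollars cents = dollars * 100 + cents

let lunch = pennies_of 12 50
let coffee = pennies_of 3 25

let () = Printf.printf "%d %d\n" lunch coffee
```

If the conversion ever needs to change, only pennies_of has to be edited, and every use of it picks up the change.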

Given that we will be using a functional programming language, you should expect it to be exceptionally easy to create new functions. You will also find that many existing functions are provided to accomplish basic tasks. We will use many of these, but you may also end up rewriting others to learn how they work.
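As one possible example of rewriting a provided function to see how it works, the sketch below re-creates list length by hand, assuming OCaml. The name my_length is invented here; List.length is the function the standard library already supplies.

```ocaml
(* An empty list has length 0. Any other list is one item longer than
   the list that remains after removing its first item.
   This simple version is not tail-recursive, so it is meant for
   learning rather than for very long lists. *)
let rec my_length items =
  match items with
  | [] -> 0
  | _ :: rest -> 1 + my_length rest

let () =
  Printf.printf "%d %d\n" (my_length [10; 20; 30]) (List.length [10; 20; 30])
```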

Modules

As we accumulate more and more functions and data, we can expect naming to become increasingly troublesome. To keep related data and functions grouped, we will use modules.

As there are a number of existing functions, there are a number of different modules provided which help to group those functions together. The one you'll encounter most frequently is Pervasives, which includes a great number of common and highly useful functions.

Any module may, for convenience's sake, be opened, allowing its contents to be directly accessed. This is the case with Pervasives, but otherwise this facility should be used with discretion.
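The difference between qualified access and an opened module might look like this, assuming OCaml. The value names are invented for the example. One caveat: more recent OCaml releases name the always-open module Stdlib rather than Pervasives, so the sketch relies only on the fact that its functions need no prefix.

```ocaml
(* Qualified access: the module name says where each function lives. *)
let count = List.length [1; 2; 3]
let shout = String.uppercase_ascii "quack"

(* Functions from the always-open default module need no prefix. *)
let biggest = max 3 7
let text = string_of_int 42

(* A local open keeps the unqualified names confined to one
   expression, which is one way of opening a module with discretion. *)
let doubled = List.(map (fun n -> n * 2) [1; 2; 3])

let () = Printf.printf "%d %s %d %s\n" count shout biggest text
```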

As with functions, modules are easily created and experimented with. We will use them extensively as we go.
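Creating a module can be as short as the following sketch, assuming OCaml. The module Duck and everything inside it are hypothetical, made up purely to show the shape.

```ocaml
(* A module groups related data and functions under one name. *)
module Duck = struct
  let name = "Mallard"

  (* Build a string of the given number of quacks, separated by spaces. *)
  let quack times =
    String.concat " " (List.init times (fun _ -> "Quack"))
end

(* Outside the module, its contents are reached through its name. *)
let () = Printf.printf "%s says: %s\n" Duck.name (Duck.quack 3)
```

Because the names live inside Duck, another module could define its own name or quack without any clash.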
