Archive for September, 2008
Today I am taking a huge detour from technical matters to lay out the
philosophical groundwork behind this blog. The ideas presented today
lie beneath every essay on this site. It is easy to observe that people
seem driven to formulate absolute truths to guide their pursuits.
Programming is no different, programmers are driven to find the
absolutes that will universally guide their efforts. Those absolutes
are not that hard to find, if you know the method for seeking them
out. Fortunately, we have hundreds and thousands of years
of human efforts, both successes and failures, to draw upon when
embarking upon the task.
Absolutes in the Post-Modern Age
Academics refer to our current stage of history as the “Post-Modern”
age. Thinking in the post-modern age is dominated by a
of the very concept of absolute truth. Many thinkers have noted
that in the post-modern age the only absolute is that there
are no absolutes. Now, anybody who has not
bothered to read much past what they are handed likely believes
much of this without even thinking about it, they may not know that
in the history of the human race such thinking is less than 60
But that “no absolutes” stuff is all nonsense at
best and downright cowardice at
worst. If you want an example of an absolute truth, try stepping
off the edge of a cliff: even if you do not believe in gravity,
gravity believes in you. It is an absolute truth for me that if
I do not take care of my customers my life becomes unpleasant.
It is a further absolute truth for me that I constantly obvserve
programmers proclaiming absolutes (always use relational, always
use OO, etc). When I stop observing it, then I suppose it won’t
be an absolute anymore (and I suppose then it never was?)
So let us now cheerfully ignore the wailing of those who cry
that there are no absolutes, and ask if we might discover some
elements of software development strategy that hold true always
(ok, maybe mostly always) for the
context of database application development.
Aristotle and Virtue
Nowadays nobody has to read philosophy much anymore, at least not where
I live (in the United States), so most programmers have never heard of
a man named Aristotle, who lived about 2500 years ago. This is a shame,
because Aristotle had a logical way of thinking about things that
would warm the heart of any programmer.
One of Aristotle’s major contributions to civilization was his
formulation of what philosophers call “virtue”. Philosophers use the
term in a technical sense, and they do not use “virtuous” to mean
“nice” or “pleasant” or “good-natured.” To a philosopher (or at least
those that taught me) something is virtuous in Aristotelean terms if
if performs its function well. The standard classroom example
is that a virtuous table serves the function of a table, and a
virtuous table maker is somebody who makes good tables.
This is a very useful concept for programmers. If we want to speak
of a “virtuous” program, we mean simply one that meets its goals.
This takes the whole high-minded theory and philosophy stuff back
to real down-to-earth terms. (This is why I always preferred Aristotle
In the quest for the absolute, if we let the ancient philosophers
guide us, we discover the surprisingly basic idea that our programs
should perform their functions well if they are to be called
virtuous. This is easy to swallow, easy to understand, and
easy to flesh out.
What is a Virtuous Computer Program?
A virtuous computer program is one that serves its purpose well, and
so we need to flesh out the three purposes that are common to most
- To meet some institutional or strategic goal of those who
sign the checks (or accept the work as charity in some case).
- To meet the goals of end-users, which almost always comes down
to performance and ease-of-use.
- To provide income for the developers (or meet their own goal of
providing charity work for non-profits).
Notice what is not on the list, things like ensure all data resides
in a relational database, or implement all code in strictly
object-oriented languages. We are not nearly ready to consider
such specific strategies as those, they are completely out of place
here in a discussion of the unifying goals of all projects.
So let’s review. So far we know that the
absolutes of programming are the pursuit
of virtue, which turns out to be a fancy way of saying that the program
should perform its functions well, which turns out to mean simply that
it should do what the check-signer asked for, in a way that is workable
for the end-users, and at a price that keeps the programmer fed.
This leads us towards strategies for reaching those goals.
The Virtuous Programming Strategy
Continuing with the idea that a virtuous program meets is basic
goals, we can say that a virtuous strategy smooths the way for
a programmer to meet the basic goals.
An unvirtuous (or just plain bad) strategy
litters the path with obstructions or ends up not meeting the
goals of the check-signer, end-users, programmer, or all of the
Before we can begin to formulate a strategy, we must look next
at the reality of the programming world. Some of the fundamental
realities include (but are not limited to):
- The end-user or check-signer may not fully understand or
be able to articulate their requirements.
- The programmer may not correctly understand requirements,
even when correctly articulated.
- In a healthy prosperous situation there will be new
requirements that interact with established requirements
in ways that range from no interaction at all to
- The world will change around you, creating demands that did
not exist when the system was created (some of us can still
remember when there was no internet).
- Staff will come and go.
- …and so on.
So even before we begin formulating particular strategies for particular
situations, we recognize that our strategy had underlying goals it
must facilitate, such as:
- Being easy to change, both for correcting mistakes
and adding features.
- Being able to maintain and sort out possibly contradictory
requirements that arise as the years go by.
- Requiring little or no “deep magic” that depends on arcance
knowledge of employees who may depart.
- Being able to expect the unexpected (like the explosion
of the web etc.)
Only after we have worked through to this point can we begin to
evaluate specific strategies and technologies. We can now begin
to ask about the proper context of the database server, where to
language. Anything that responds to our core goals and realities
can be considered for use, anything which does not play into the
core goals is useless at best and obstructive at worst.
Future essays (and some past essays) in this series will refer
back to these ideas. For example, many developers have observed
over the years that if you >Minimize Code and Maximize Data then you gain many advantages
in terms of development time, robustness, and feature count.
Other ideas similar to this will come out over and over in future
essays in this series.
The strategies and techniques that you will see on this blog are
all aimed at one way or another towards the goals expressed in
this essay. At the very beginning comes the goals of the
check-signer, the end-users, and the programmer. From there we
seek strategies that will satisfy our need to grow, change,
correct, and adapt. Only then can we ask about the technologies
such as databases and object-oriented languages and see how
well they let us meet all of these goals.
A PVII CSS Page Pack. Create up to 12 engaging CSS Layouts in Dreamweaver— Instantly!
The list below explains all topics past and future, including
slots for essays I plan to revise or add to. This posting
replaces our >Original Table of Contents, as well as our >Master Table Design Patterns List and the >List of Future Topics.
As of this writing, September 2008, this blog is now about
10 months old. I have been able to produce at least one
entry on most of the major topics that I hoped to cover, and
can now see opportunities to flesh out and expand many
topics. On top of that, I have been receiving requests for
topics both via email and comments. The table below is a
guide to what topics are forthcoming, which have been
treated in the past, and which you will likely see in the future.
|>||>The most general ideas that guide database software projects.|
|>||>The foundation of all ideas on this blog.|
|>||>There are precious few truly unique application|
|> Technology Trumps Dogma||>…future topic…|
|> The Customer is Always Right||>…future topic…|
|>||>Reviews the basic fact that it always starts and ends with the data|
|> Pervasive Security||>…future topic…|
|>||>A brief review of the mathematical basis and|
|> The Object-Oriented School of Thought||>…future topic…|
|>||>Specific practical arguments for strategies to improve|
|>||>Explains one of the most generally beneficial|
|>||>Response to various “code is data” comments|
|>||>Explains the overal practical benefits of normalization.|
|>||>The practical benefits of first normal form.|
|>||>The practical benefits of second normal form.|
|>||>Describes 3NF and introduces the problems|
|>||>Explains why full normalization is not optimal,|
|>||>Explains the three fundamental denormalization|
|>||>Review of the architecture-level strategies|
|> Constraints Minimize Code||>…future topic…|
|>||>The Data Dictionary is possibly the most powerful tool|
|>||>Overall introduction to the main benefits of|
|>||>Dictionary-based upgrades are the most efficient|
|>||>Special details to be aware of for indexes|
|>||>Shows how to specify the FETCH denormalizing|
|>||>Particular issues to watch out for when|
|> Applying Security||>…future topic…|
|> Organizing Your Database||>…future topic…|
|> When You Still Need Scripts||>…future topic…|
|> Validating a Spec and Diff||>…future topic…|
|>||>Practical information about day-to-day programming|
|> SQL Commands||>…future topic…|
|>||>The very basic clauses of the SQL SELECT,|
|>||>The cornerstone of powerful queries.|
|>||>The many forms of JOIN|
|>||>How to examine multiple rows to find|
|>||>How to use UNION, with special|
|> INSERT||>…future topic…|
|> UPDATE||>…future topic…|
|> DELETE||>…future topic…|
|> Performance||>…future topic…|
|>||>Practical method for loading large amounts|
|>||>Looks at how indexes, denormalization and other|
|> Server Side Abilities||>…future topic…|
|>||>Presents my own view (that I have not seen|
|> Stored Procedures||>…future topic…|
|> Views||>…future topic…|
|> Indexes||>…future topic…|
|> ACID Compliance||>…future topic…|
|> Transactions||>…future topic…|
|> Server-Side Security||>…future topic…|
|>||>A broad overview of how to make use of the|
|> Table Security||>…future topic…|
|> Row-level Security||>…future topic…|
|> Column-level Security||>…future topic…|
|>||>This section contains unusual or special techniques|
|>||>How to make a table automatically convert an insert|
|>||>The most important skill for a database programmer is|
|> Customer Interviews||>…future topic…|
|>||>How to apply to concept of iterative development|
|>||>Explains why requirements will never be|
|> Pattern Recognition||>…future topic…|
|> The Power of Primary Keys and Foreign Keys||>…future topic…|
|>||>Basic introduction to primary keys|
|>||>Basic introduction to foreign keys|
|>||>General review of why the concept of a constraint is|
|> Table Design Specifics||>…future topic…|
|>||>Introduces the basic table types and what|
|>||>How to handle a primary key that may change|
|>||>How foreign keys are different depending on|
|> Transaction Tables||>…future topic…|
|>||>When not all possible combinations in|
|> Cross References||>…future topic…|
|>||>Using a cross reference as a parent|
|> Specialized Patterns||>…future topic…|
|>||>How to pick the correct value when the|
|>||>How to get the most out of tables that|
|>||>How to sort out dependencies (aka|
|>||>How to use server triggers to prevent|
|> Anti-Patterns||>…future topic…|
|>||>How to recognize a pattern that appears|
|>||>Essays that move beyond the database itself, explaining|
|> Review of Various Web Architectures||>…future topic…|
|>||>The most widely read topic ever on this|
|> Connecting To The Database||>…future topic…|
|> Putting Business Logic in The Server||>…future topic…|
|>||>How to improve performance by understanding the|
|>||>Frustrations and Advantages in using the browser|
|>||>Thoughts on frameworks, upgrade and build tools, and any|
|>||>Examines a deeply nested table structure and how|
|> Build Tool||>…future topic…|
|> Deploy Tool||>…future topic…|
|>||>Earliest essays from this blog series.|
Most web-based database applications make use of email
to allow users to change their passwords. Completing
securing this operation can be tricky business, and one
of the best ways to do it is to user database
Disclaimer 1: Only As Secure as Email
We tend to take it for granted today that password reset systems
work through email. We reason that if a user can access an email
sent by us then they are who they say they are. Obviously this
will not be true if a user’s email account has been compromised.
Dealing with the possibility of compromised email accounts is
outside the scope of this week’s essay. There are other
strategies available to reduce that risk, but they will be
treated in some future essay.
Disclaimer 2: Only SSL (HTTPS) of Course!
It is not much use giving yourself a super-secure email system
if you transmit sensitive information over unencrypted
connections. Secure Socket Layers (SSL) should always be used
when high security is required. For the end-user this means
they are going to a site through HTTPS instead of HTTP.
Password Resets vs. Sending Passwords
On some low-security systems it is acceptable
to send a user his password in an email. This approach
is very ill-advised in higher security contexts because
we have no control over the user’s storage of that email.
It could end up anywhere, and anybody might read it.
When security requirements are higher, it is better
to force the user to reset their password. There are
several reasons for this, but the important one here
is that we do not want to send the actual password in
an email. Therefore we must send a link that sends them
to a page where they can provide a new password.
If we spell out the requirements for a secure password
reset system, they are at the very least these:
- We must generate some hash and send it to the user,
this is how she will identify herself so we
can let her change her password.
- The hash must expire at some point, since we cannot
be sure the user will completely purge out the
email (or that he even can, depending on the policy
of the email host).
- It must be completely impossible for anybody to read
the hash, otherwise they could intercept the
reset process and set a password for themselves.
- Despite requirement 3 just listed, we must somehow
verify the hash when the user presents it.
- We must be able to change the user’s password, which
is a priveleged operation, even though the user
is not even logged in.
It is not actually possible to implement these requirements
in application code alone (or perhaps I should say is not
possible to do it and meet minimum acceptable risk).
There are two problems if you try it:
- Requirements 3 and 4 cannot be reconciled. If the
application is able to read the hash to verify it,
then a vulnerability in the application code could
lead to compromise. If we implement in application
code we have the burden of ensuring practically
zero vulnerabilities, while if we go server-side
we have no such burden (at least for this feature).
- Requirement five requires the application code to
connect at a very high privelege level, which could
lead to completely unrelated vulnerabilities.
Implementing In The Database
The system I will now describe meets all 5 of the
requirements listed above while never requiring a
priveleged connection to the database. The feature
is implemented in an isolated system that cannot
touch other systems, and it has no burden to be
particularly careful in writing the application
Since a picture is worth a thousand words, here it is:
The process begins at the top left. The user
clicks on some “Forgot Password” link and provides
an email or account id. This goes to web server
which generates an INSERT to the insert-only
table of hashes. This insert contains only the
user’s id, nothing else is needed.
There is a trigger on the table that fires on the
INSERT. This trigger generates the hash and
sends the email to the user.
The salient features here are that the table is
insert-only, which is explained below, and that
the trigger operates at super-user level, which
is also explained below.
Once the user receives the link and clicks on it,
our process goes over to the right. The user
lands on a page and provides a new password
(and probably of course must type it in twice).
The web server does basic things like making sure
the two values match, that the password is long
enough, and like that, and then generates an
INSERT into a second table. The insert contains
the email or account ID, the hash, and the
desired new password.
The magic begins on the INSERT into the second
table. An INSERT trigger running at superuser
level is allowed to look at the first table and
verify the hash and its expiration. If these
match, it sets the user’s password.
Simple, really, IMHO.
Feature 1: Insert Only Tables
This system depends on creating tables that any
unpriveleged user can insert into, but which nobody
can SELECT from or UPDATE to or DELETE from.
This may sound like a joke: “Insert Only Table”, something
like “Write only memory”. But the idea is very simple,
if nobody can SELECT from the table then nobody can
discover active hashes. If nobody can UPDATE the table
then nobody can forge hashes. Finally, if nobody can
DELETE from the table then nobody can cause mischief.
The code for the tables looks like this:
-- FIRST TABLE CREATE TABLE users_pwrequests ( recnum_pwr integer, user_id character varying(40), md5 character(32), ts_ins timestamp without time zone, ) -- NOTE! This syntax is PostgreSQL, there may be -- slight variations on other platforms. REVOKE ALL ON TABLE users_pwrequests FROM PUBLIC; GRANT INSERT ON TABLE users_pwrequests FROM PUBLIC; -- SECOND TABLE CREATE TABLE users_pwverifies ( recnum_pwv integer, user_id character varying(40), md5 character(32), member_password character varying(20), ) REVOKE ALL ON TABLE users_pwverifies FROM PUBLIC; GRANT INSERT ON TABLE users_pwverifies FROM PUBLIC;
Feature 2: Trigger Security Priveleges
It is possible on most servers to severely limit
a user’s allowed actions on a table, but then
to provide trigger code that fires on those actions
and executes a super-user level. Today’s technique
depends upon this ability. Trigger code operating
at superuser level can look at the insert-only
table to verify a hash, and it can also set the
This basic ability is
what makes triggers so amazing and cool
for implementing business logic (see also
>Triggers and Encapsulation), because there
is no way for a user to directly invoke a
trigger for his own nefarious purposes, and there
is no way for a cracker to avoid the firing of
the trigger if he performs an action on a table.
Triggers are truly the most powerful example of
encapsulation of data and code that is available
to today’s programmer.
The first trigger looks something like this (
this is PostgreSQL code, your server will likely
require variations) (I have also stripped it
down for brevity, it may not work exactly
CREATE OR REPLACE FUNCTION users_pwrequests_ins_bef_r_f() RETURNS trigger AS $BODY$ DECLARE NotifyList text = ''; ErrorList text = ''; ErrorCount int = 0; AnyInt int; AnyRow RECORD; AnyChar varchar; AnyChar2 varchar; AnyChar3 varchar; AnyChar4 varchar; BEGIN -- necessary for an old glitch in pg security SET search_path TO public; -- Only execute if the user's id is valid SELECT INTO AnyInt Count(*) FROM users WHERE user_id = new.user_id; IF AnyInt > 0 THEN SELECT INTO AnyChar email FROM users WHERE user_id = new.user_id; -- This lets you put the email itself into -- a table for admin control SELECT INTO AnyChar2 variable_value FROM variables WHERE variable = 'PW_EMAILCONTENT'; -- Also the server is stored in a table SELECT INTO AnyChar3 variable_value FROM variables WHERE variable = 'SMTP_SERVER'; -- This becomes the email FROM Address SELECT INTO AnyChar4 variable_value FROM variables WHERE variable = 'EMAIL_FROM'; IF AnyChar4 IS NULL THEN AnyChar4 = ''; END IF; -- Very important! Set the md5 hash! new.md5 := md5(now()::varchar); -- Call out to a stored procedure that sends emails PERFORM pwmail(AnyChar ,'Password Reset Request' ,AnyChar2 || new.md5 ,AnyChar3 ,AnyChar4); EXECUTE ' ALTER ROLE ' || new.user_id || ' NOLOGIN '; END IF; -- 3000 PK/UNIQUE Insert Validation END; $BODY$ -- The "SECURITY DEFINER" is crucial, it allows -- the trigger to run as the super-user who -- created it LANGUAGE 'plpgsql' VOLATILE SECURITY DEFINER
The second trigger looks like this:
CREATE OR REPLACE FUNCTION users_pwverifies_ins_bef_r_f() RETURNS trigger AS $BODY$ DECLARE NotifyList text = ''; ErrorList text = ''; ErrorCount int = 0; AnyInt int; AnyRow RECORD; AnyChar varchar; AnyChar2 varchar; AnyChar3 varchar; AnyChar4 varchar; BEGIN SET search_path TO public; -- Read the first table to see if the -- link is valid and has not expired SELECT INTO AnyInt Count(*) FROM users_pwrequests WHERE user_id = new.user_id AND md5 = new.md5 AND age(now(),ts_ins) < '20 min'; IF AnyInt = 0 THEN ErrorCount = ErrorCount + 1; ErrorList = ErrorList || 'user_id,9005,Invalid Link;'; ELSE -- Magic! The user's password is set EXECUTE 'ALTER ROLE ' || new.user_id || ' LOGIN PASSWORD ' || quote_literal(new.member_password); -- Very important! Now that we have set it, -- erase it so it is not saved to the table new.member_password := ''; END IF; -- 3000 PK/UNIQUE Insert Validation IF ErrorCount > 0 THEN RAISE EXCEPTION '%',ErrorList; RETURN null; ELSE RETURN new; END IF; END; $BODY$ LANGUAGE 'plpgsql' VOLATILE SECURITY DEFINER;
Feature 3: Sending Email From Database Server
The technique present above requires that your
database server be able to send emails. This
is not always possible. Postgresql
can do it, and I have to believe the other big guys
can as well, but I have not tried it yet personally.
To send emails through a PostgreSQL server, you must
install Perl as an untrusted language, and then install
the Perl MAIL package. If anybody wants to know more
about that then please leave a comment and I will
expand the essay to include that.
Feature 4: The Empty Column
There is one more note that should be made. To use
this system, you must tell the server the user’s
desired new password. To do that, you must actually
make it part of the INSERT command and therefore you
must have a column for it in the 2nd read-only table.
However, you certainly do not want to actually save
it, so you have the trigger set the password
first and then blank out the value, so the final
row saved to the table does not actually contain
anything. This is noted in the code comments on
the second trigger, which is included above.
The technique presented today makes full use of
database server abilities to create a password
reset system that is highly resistant to forgery,
interception, and evil-admin meddling. It makes
use of a combination of restrictive table security,
priveleged trigger code, and sending emails from
the database server.