/article/discussion/arrayIndexes.php

# Array Indexes Must Start At 1, Otherwise Cripled

- historical development
- mathematical extensions
- and ergonomics.
- I will also consider argumentation of others.

## Performance difference

Even the low-level assembler is a compiled language - which neutralizes the dilemma completely because the
compiler can take care of the translation. If a language really should allow tackling problems relying on
such
micro-optimizations, syntactical constructs like `[`

and _{0} ]`[`

can be used to address this.
_{1}
]

## Extendability of concept

### Accessing items from the end of the array

If we begin array with a `0`

, we have no choice, but to take `-1`

as the index of
first item from the end. Thinking in 0-based paradigm from the start and in 1-based paradigm from the
end -
that requires to be skilled in thinking in both. To start indexing with the `1`

, allows a
simpler, symmetrical extension from domain of unsigned to the domain of signed numbers.

`+0`

and `-0`

do exist in certain types of numerical encodings^{1}. This representation exists at the level of source code, but there is no concept which would capture this distinction at runtime.

And so 0-based arrays could hold the line, somehow^{3}, yet the extensibility concept does not
end
there. *An array does not consist all just of neatly ordered items.*

### Addressing space between items

Inserting an item between some other two is exactly that, addressing a gap. Usually, inserting is
treated as a special operation, where the index passed as parameter, talks about inserting
*before* the item of that index^{2}. Yet there are two different semantics: inserting
before and after.

The same goes for addressing slices. Does the starting index include the element, is the ending index "up to, not including"? Now there are four options, when one has no choice but to stick to integers. Floating points offer a solution.

The concept of 1-based array can straightforwardly encompass this - defining an item to be placed at
positions `i+0.5`

or `i-0.5`

is just that. The extension, is from the domain of
signed numbers to the domain of floating-point numbers.

The 0-based concept breaks down here. In it's native, the positive domain, inserting item
before `0`

, would mean to insert it at `-0.5`

. Worse
than counterintuitive, collision occurs, when inserting item in the negative domain after the end, at
the index `-1 + 0.5`

.

### Further numerical extensions

One basedness allows `0`

to have special meaning, but also `+Inf`

,
`-Inf`

and `NaN`

can be difined to have a meaning. Languages might want to begin supporting
`+0`

and `-0`

floating point expressions.

^{1}Wikipedia entry of signed zero.

^{2}There is no other choice in 0-baseness, explained in a later paragraph.

^{3}As well, IEEE floating-point supports

`+0`

and `-0`

.
## Notation Comparism

Here are all different notations of iteration upon zero- and one-based arrays. I only list iterating through the positive indexes, as it is always simpler and using the negative domain does not lead to any advantages.

There are always two alternatives on how to notate an iteration: having a sharp and unsharp comparism. I included both for completeness, just note that the upper one of the pairs is the better/usual way to notate the range.

### form

### problems

### special chars variation

(i=0; i<n; i++)

1

in 1 cluster

and 1 logical group

(i=0; i<=n-1; i++)

3

in 2 clusters

and 2 logical groups

(i=n-1; i>=0; i--)

3

in 2 clusters

and 2 logical groups

(i=n-1; i>-1; i--)

more groups than clusters

invalid index: -1

3

in 2 clusters

and 3 logical groups

(i=1; i<=n; i++)

2

in 1 cluster

and 1 logical group

(i=1; i<n+1; i++)

visually cumbersome: n+1

2

in 2 clusters

and 2 logical groups

(i=n; i>=1; i--)

2

in 1 cluster

and 1 logical group

(i=n; i>0; i--)

1

in 1 cluster

and 1 logical group

1-basedness iterations are beautifully symmetrical, using the same unsharp compare operator (larger-equal /
smaller-equal). This also shows how practical is, when the upper index is the same as element count. Imagine
every `n`

would be expressed as `count(arr)`

.

Visual simplicity makes reading and understanding simpler, especially boundaries which are in-sync with how they are expressed. This aids natural perception and direct expectancy fulfilment.

## v.s.

### Dijkstra

I found it was a rather quickly written note.

Dijkstra rooted himself in the first section about intervals, didn't put the two variations against each other anymore in the later parts and thus had no choice but arriving at 0-based index as the more natural one.

We sail apart right at the first argument, because having constants readable on the monitor is what I consider as expected, instead of thinking in length of runs. There are two subtractions which give correct length (expanding either the lower side or the upper side of the interval). Yet visually in-sync, is only one variation. The empirical evidence is 35 years old without a reference to inspect it.

My argumentation worked itself through multiple areas independently, and in each section found 1-based array to be a better variation in all areas.

### Others

Most discussion pro 0-baseness contains reference to Dijkstra and simple arguments without analysis (like a naive argument on obvious performance).

1-based arguments often contain references to mathematical notation and pedagogical reasons reasoned by expectations and simplicity.

## Conclusion

Well defined concepts which provide sound extensions - are in mathematics and it's applied areas, like programming, what it is all about. To me, 0-based arrays seem like a missed decision on decoupling of assembler language from binary code - which was bound on the hardware and it's structure by necessity. We are struggling with that ever since.

## Links

Wiki of Cunningham & Cunningham: Zero And One Based Indexes

Wikipedia: Zero based numbering