Basic Data Structures · Arrays

晴天和猫 2023-06-02 2023-06-07

Basic Data Structures

Arrays

Overview

Definition

In computer science, an array is a data structure consisting of a collection of elements (values or variables), each identified by at least one array index or key.

Because the elements of an array are stored contiguously, the address of any element can be computed from its index. For example:

int[] array = {1, 2, 3, 4, 5};

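Given the address at which the array's data starts, $BaseAddress$, the address of the element at index $i$ is $BaseAddress + i * size$

$i$ is the index; in Java, C, and similar languages it starts at 0
$size$ is the number of bytes each element occupies, for example $int$ takes $4$ and $double$ takes $8$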
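The formula above can be sketched as a small helper. This is only an illustration of the arithmetic: the class name `ElementAddress`, the method `elementAddress`, and the base address `0x1000` are made up for the example, and ordinary Java code cannot observe real object addresses this way.

```java
class ElementAddress {
    // Address of the element at `index`, given the address where the array data
    // starts and the size of one element in bytes: BaseAddress + i * size.
    static long elementAddress(long baseAddress, int index, int elementSize) {
        return baseAddress + (long) index * elementSize;
    }

    public static void main(String[] args) {
        long base = 0x1000L; // hypothetical base address, chosen only for illustration
        // int elements occupy 4 bytes, so index 3 lies 12 bytes past the base
        System.out.println(Long.toHexString(elementAddress(base, 3, Integer.BYTES)));
        // double elements occupy 8 bytes, so index 3 lies 24 bytes past the base
        System.out.println(Long.toHexString(elementAddress(base, 3, Double.BYTES)));
    }
}
```

Because the computation is one multiplication and one addition regardless of `index`, the cost of locating an element does not depend on where it sits in the array.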

Quick quiz

byte[] array = {1, 2, 3, 4, 5};

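Given that the data of array starts at address 0x7138f94c8, what is the address of the element 3?

Answer: the element 3 sits at index 2 and a byte occupies 1 byte, so 0x7138f94c8 + 2 * 1 = 0x7138f94ca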

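Space Usage

In Java, an array is laid out as follows

8-byte markword
4-byte class pointer (when compressed class pointers are enabled)
4-byte array length (the length is a signed int, so the maximum capacity is at most $2^{31} - 1$)
Array elements + alignment bytes (in Java the size of every object is a multiple of 8 bytes[^12]; any shortfall is filled with alignment bytes)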

For example

int[] array = {1, 2, 3, 4, 5};

occupies 40 bytes, broken down as follows

8 + 4 + 4 + 5*4 + 4(alignment)
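The breakdown above can be turned into a rough estimator. This is a sketch under the same assumptions as the text (8-byte markword, 4-byte compressed class pointer, 4-byte length, 8-byte alignment); the class `ArrayFootprint` and the method `estimateBytes` are hypothetical names, and actual sizes depend on the JVM and its settings.

```java
class ArrayFootprint {
    static final int MARK_WORD = 8;     // bytes
    static final int CLASS_POINTER = 4; // bytes, assuming compressed class pointers
    static final int LENGTH_FIELD = 4;  // bytes
    static final int ALIGNMENT = 8;     // object sizes are multiples of 8 bytes

    // Estimated size in bytes of an array holding `length` elements
    // of `elementSize` bytes each, including header and alignment padding.
    static long estimateBytes(int length, int elementSize) {
        long raw = MARK_WORD + CLASS_POINTER + LENGTH_FIELD + (long) length * elementSize;
        // round up to the next multiple of ALIGNMENT
        return (raw + ALIGNMENT - 1) / ALIGNMENT * ALIGNMENT;
    }

    public static void main(String[] args) {
        System.out.println(estimateBytes(5, Integer.BYTES)); // int[5]
        System.out.println(estimateBytes(5, Byte.BYTES));    // byte[5]
    }
}
```

For `int[5]` the raw size is 36 bytes, which rounds up to the 40 bytes shown above.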

Random Access Performance

That is, looking up an element by its index, which has time complexity $O(1)$

Dynamic Array

Java version

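import java.util.Arrays;
import java.util.Iterator;
import java.util.function.Consumer;
import java.util.stream.IntStream;

public class DynamicArray implements Iterable<Integer> {
    private int size = 0; // logical size (number of elements stored)
    private int capacity = 8; // capacity of the backing array
    private int[] array = {}; // backing array, allocated lazily on the first add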


    /**
     * Adds an element at the last position [size]
     *
     * @param element the element to add
     */
    public void addLast(int element) {
        add(size, element);
    }

    /**
     * Adds an element at a position in [0 .. size]
     *
     * @param index   index position
     * @param element the element to add
     */
    public void add(int index, int element) {
        checkAndGrow();

        // insertion logic
        if (index >= 0 && index < size) {
            // shift elements one slot to the right to free the insertion position
            System.arraycopy(array, index,
                    array, index + 1, size - index);
        }
        array[index] = element;
        size++;
    }

    private void checkAndGrow() {
        // capacity check
        if (array.length == 0) {
            // first use: allocate the backing array once
            array = new int[capacity];
        } else if (size == capacity) {
            // grow; common growth factors are 1.5, 1.618 and 2 (1.5 is used here)
            capacity += capacity >> 1;
            int[] newArray = new int[capacity];
            System.arraycopy(array, 0,
                    newArray, 0, size);
            array = newArray;
        }
    }

    /**
     * Removes an element from the range [0 .. size)
     *
     * @param index index position
     * @return the removed element
     */
    public int remove(int index) { // [0..size)
        int removed = array[index];
        if (index < size - 1) {
            // shift the following elements one slot to the left
            System.arraycopy(array, index + 1,
                    array, index, size - index - 1);
        }
        size--;
        return removed;
    }


    /**
     * Looks up an element
     *
     * @param index index position, within [0..size)
     * @return the element at that index
     */
    public int get(int index) {
        return array[index];
    }

    /**
     * Traversal method 1
     *
     * @param consumer the action to run during traversal; argument: each element
     */
    public void foreach(Consumer<Integer> consumer) {
        for (int i = 0; i < size; i++) {
            // supplies array[i]
            // returns void
            consumer.accept(array[i]);
        }
    }

    /**
     * Traversal method 2 - iterator traversal
     */
    @Override
    public Iterator<Integer> iterator() {
        return new Iterator<Integer>() {
            int i = 0;

            @Override
            public boolean hasNext() { // is there a next element?
                return i < size;
            }

            @Override
            public Integer next() { // return the current element and advance to the next one
                return array[i++];
            }
        };
    }

    /**
     * Traversal method 3 - stream traversal
     *
     * @return a stream over the stored elements
     */
    public IntStream stream() {
        return IntStream.of(Arrays.copyOfRange(array, 0, size));
    }
}

All of these method implementations leave out index validity checks and assume that every index passed in is valid
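For completeness, here is one way the omitted checks might look. It is a minimal sketch, not part of the class above: `IndexChecks`, `checkElementIndex`, and `checkPositionIndex` are hypothetical names, and `Objects.checkIndex` requires Java 9 or later. The two helpers differ because `get` and `remove` accept indexes in [0 .. size), while `add` also accepts `size` itself.

```java
import java.util.Objects;

class IndexChecks {
    // For get/remove: a valid index lies in [0 .. size)
    static int checkElementIndex(int index, int size) {
        // throws IndexOutOfBoundsException when index < 0 or index >= size
        return Objects.checkIndex(index, size);
    }

    // For add: index == size (append at the tail) is also allowed, so the range is [0 .. size]
    static int checkPositionIndex(int index, int size) {
        if (index < 0 || index > size) {
            throw new IndexOutOfBoundsException(
                    "index " + index + " is outside [0, " + size + "]");
        }
        return index;
    }

    public static void main(String[] args) {
        int size = 3;
        System.out.println(checkElementIndex(2, size));  // last stored element
        System.out.println(checkPositionIndex(3, size)); // tail insertion position
    }
}
```

Each public method would call the matching helper on its first line, before touching the backing array.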

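Insertion and Deletion Performance

At the head, the time complexity is $O(n)$

In the middle, the time complexity is $O(n)$

At the tail, the time complexity is $O(1)$ (amortized)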
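The amortized claim for tail insertion can be checked with a small simulation. The sketch below assumes the same growth policy as the class above (initial capacity 8, `capacity += capacity >> 1`) and counts how many element copies the grow steps cause while appending `n` elements; `GrowthCost` and `copiesForAppends` are names invented for this example.

```java
class GrowthCost {
    // Number of element copies caused by growing the backing array
    // while appending n elements one at a time.
    static long copiesForAppends(int n) {
        int capacity = 8;
        long copies = 0;
        for (int size = 0; size < n; size++) {
            if (size == capacity) {
                copies += size;            // every stored element moves to the new array
                capacity += capacity >> 1; // grow by roughly 1.5x
            }
        }
        return copies;
    }

    public static void main(String[] args) {
        for (int n : new int[]{1_000, 10_000, 100_000}) {
            long copies = copiesForAppends(n);
            System.out.println(n + " appends, " + copies + " copies, "
                    + (double) copies / n + " copies per append");
        }
    }
}
```

Because the capacities form a roughly geometric sequence, the total number of copies stays proportional to `n`, so the average cost per append is a small constant even though an individual append that triggers a grow costs $O(n)$.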