Java源码笔记

Java源码笔记 since 2013-27-27

ArrayList转化为数组

java.util.ArrayList提供了两个方法：

// method#1
public Object[] toArray() {
    return Arrays.copyOf(elementData, size);
}
// method#2
@SuppressWarnings("unchecked")
public <T> T[] toArray(T[] a) {
    if (a.length < size)
        // Make a new array of a's runtime type, but my contents:
        return (T[]) Arrays.copyOf(elementData, size, a.getClass());
    System.arraycopy(elementData, 0, a, 0, size);
    if (a.length > size)
        a[size] = null;
    return a;
}
// referenced method by method#2
public static <T,U> T[] copyOf(U[] original, int newLength, Class<? extends T[]> newType) {
    // 构造一个对应数据类型的新数组
    T[] copy = ((Object)newType == (Object)Object[].class)
        ? (T[]) new Object[newLength]
        : (T[]) Array.newInstance(newType.getComponentType(), newLength);
    // 拷贝数据
    System.arraycopy(original, 0, copy, 0,
                     Math.min(original.length, newLength));
    return copy;
}

// 测试代码
ArrayList<Integer> al = new ArrayList<Integer>();
al.add(1);al.add(2);
Integer[] arr = new Integer[al.size()];
// arr = (Integer[])al.toArray(); // Java不能一次性强制转化数组中每个元素类型
al.toArray(arr); //这是Java提供的解决方法

equals() vs hashCode()

从设计原则角度看，equal()方法要‘强于’hashCode()方法,即：equal相同，hashCode肯定相同，反之不然
最好使用immutable，final的域计算对象的hashCode和equal，如String, Integer等都是比较好的选择,这样线程安全很多(当然如果多线程环境下，还是选择ConcurrentHashMap或HashTable吧)
如果是immutable的,Java可以缓存这些对象的hashCode
如果对象的状态变了，hashCode会跟着变，那么将无法在HashMap中找到==!
如果对象要作为HashMap或HashSet的key的话，最好是immutable的，否则要注意是否需要重写equals()和hashcode()两个方法

HashMap HashSet

HashSet 是基于HashMap的，即HashSet是值都是private static final Object PRESENT = new Object();的HashMap

HashMap中迭代器每次都要遍历table找下一个key,但考虑装载因子大概0.75，所以遍历的空值不会太多而影响效率

final Entry<K,V> nextEntry() {
  if (modCount != expectedModCount)
      throw new ConcurrentModificationException();
  Entry<K,V> e = next;
  if (e == null)
      throw new NoSuchElementException();
  
  //先判断冲突链上是否有后续元素
  if ((next = e.next) == null) {
      Entry[] t = table;
      // 遍历table，找到下一个key
      while (index < t.length && (next = t[index++]) == null)
          ;
  }
  current = e;
  return e;
}

HashMap使用chain的方式解决冲突

插入基本流程: 首先根据key的hash找到要插入的bucket，如果bucket不空，使用equal方法对比chain中的是否包含要插入的key，如果包含，更新key对应的value，并将老的value返回；bucket为空，直接插入;
查找流程: 根据查找key的hash找到对应的bucket，之后使用key的equal方法比较是否存在

// MUST be a power of two <= 1«30. static final int DEFAULT_INITIAL_CAPACITY = 16; static final int MAXIMUM_CAPACITY = 1 « 30;

static final float DEFAULT_LOAD_FACTOR = 0.75f;

//构造函数

public HashMap(int initialCapacity, float loadFactor) { if (initialCapacity < 0) throw new IllegalArgumentException(“Illegal initial capacity: “ + initialCapacity); if (initialCapacity > MAXIMUM_CAPACITY) initialCapacity = MAXIMUM_CAPACITY; if (loadFactor <= 0 || Float.isNaN(loadFactor)) throw new IllegalArgumentException(“Illegal load factor: “ + loadFactor);
```
  // Find ***a power of 2*** >= initialCapacity
  int capacity = 1;
  while (capacity < initialCapacity)
      capacity <<= 1;

  this.loadFactor = loadFactor;
  threshold = (int)(capacity * loadFactor);
  table = new Entry[capacity];
  init();   }
```
public V put(K key, V value) { if (key == null) return putForNullKey(value); int hash = hash(key.hashCode()); // as below int i = indexFor(hash, table.length); // as below for (Entry<K,V> e = table[i]; e != null; e = e.next) { Object k; // 如果key存在，更新key对应的value if (e.hash == hash && ((k = e.key) == key || key.equals(k))) { V oldValue = e.value; e.value = value; e.recordAccess(this); return oldValue; } } modCount++; // key不存在，添加key-value，链式扩展解决冲突 addEntry(hash, key, value, i); return null; }

//使用链式解决冲突 void addEntry(int hash, K key, V value, int bucketIndex) { Entry<K,V> e = table[bucketIndex]; table[bucketIndex] = new Entry<>(hash, key, value, e); //保存hash一方面是扩容重hash的时候用 if (size++ >= threshold) resize(2 * table.length); //保证length是2的幂指数 }

//这个方法要注意了，取余操作h%length写成位操作 //主要是因为HashMap的扩容机制保证length是2的幂指数 static int indexFor(int h, int length) { return h & (length-1); }

/** * Applies a supplemental hash function to a given hashCode, which * defends against poor quality hash functions. This is critical * because HashMap uses power-of-two length hash tables, that * otherwise encounter collisions for hashCodes that do not differ * in lower bits. Note: Null keys always map to hash 0, thus index 0. */ static int hash(int h) { // This function ensures that hashCodes that differ only by // constant multiples at each bit position have a bounded // number of collisions (approximately 8 at default load factor). h ^= (h »> 20) ^ (h »> 12); return h ^ (h »> 7) ^ (h »> 4); }

java.lang.String.hashCode()

public int hashCode() {
    int h = hash;
    if (h == 0 && count > 0) {
        int off = offset;
        char val[] = value;
        int len = count;

        for (int i = 0; i < len; i++) {
            h = 31*h + val[off++];
        }
        hash = h;
    }
    return h;
}

IO流同步问题 TODO

InputStream/OutputStream 抽象类，read，write方法都没同步
FileInputStream/FileOuputStream read/write等方法都是native的

序列化

transient: The transient keyword in Java is used to indicate that a field should not be serialized.

ArrayList序列化，存放数据的数组被声明为transient的，之后重写了序列化规则(writeObject(), readObject())，至于为啥要声明为transient，可能是对开发者的特别提示吧

public class ArrayList<E> extends AbstractList<E>
        implements List<E>, RandomAccess, Cloneable, java.io.Serializable
{
    private static final long serialVersionUID = 8683452581122892189L;

    /**
     * The array buffer into which the elements of the ArrayList are stored.
     * The capacity of the ArrayList is the length of this array buffer.
     */
    private transient Object[] elementData;

    ....

StringBuffer扩容

代码如下，很简洁，其他容器类扩容方式不尽相同，比如HashMap等哈希容器，需要保证长度是2的幂指数

public AbstractStringBuilder append(String str) { 
    if (str == null) str = "null";
    int len = str.length();
    ensureCapacityInternal(count + len);
    str.getChars(0, len, value, count);
    count += len;
    return this;
}

private void ensureCapacityInternal(int minimumCapacity) {
    // overflow-conscious code
    if (minimumCapacity - value.length > 0)
        expandCapacity(minimumCapacity);
}

void expandCapacity(int minimumCapacity) {
    int newCapacity = value.length * 2 + 2;
    if (newCapacity - minimumCapacity < 0)
        newCapacity = minimumCapacity;
    if (newCapacity < 0) {
        if (minimumCapacity < 0) // overflow
            throw new OutOfMemoryError();
        newCapacity = Integer.MAX_VALUE;
    }
    value = Arrays.copyOf(value, newCapacity);
}

Java Regex AppendReplacement

Matcher.quoteReplacement(String replacement): literalize replacement, slash(‘/’) or dollar(‘$’) with no special meanings.

public static String updateCharsetValue(String file, Charset charset) {
    Matcher match = GenericCharsetPatt.matcher(file);
    StringBuffer nHtml = new StringBuffer();
    String pre, post;
    while (match.find()) {
        pre = match.group(1);
        post = match.group(3);
        match.appendReplacement(nHtml, Matcher.quoteReplacement(pre
            + charset.displayName() + post));
    }
    match.appendTail(nHtml);
    return nHtml.toString();
}

java.lang.Object

好多方法都是native的，即调用的虚拟机的底层语言，一般为c++
clone方法是一个‘浅拷贝’，我的理解是对象中的所有域做了‘值拷贝’：
1. 对象中的域分三种：primitive，不变对象(final的，如字符串)，普通对象
2. 对于primitive和不变对象进行‘值’拷贝是没有任何问题的，因为它们是不会变的
3. 对于普通对象的‘值’拷贝，相当于只拷贝了引用，但指向的都是同一个实际对象