locality

  • temporal locality
  • spatial locality

128-byte cache line: the chunk of data transferred between the cores' caches -> at any given time a line is associated with a specific core -> an access from another core 'moves' the access authority (ownership) to that core

problem: writing to any location in a cache line requires exclusive access to the whole line, e.g. core1 accessing l1, core2 accessing l2 (different parts of the data, same line) -> each access to the l's requires exclusive access to the cache line, so the line bounces between the cores

false sharing: the data is not shared in the code, but the underlying hardware is actually sharing it
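A minimal sketch of the layout difference, assuming the 128-byte line size noted above (many CPUs use 64). `Padded`, `Shared`, `SharedPadded`, and `hammer` are hypothetical names for illustration only. The padded layout usually runs faster on real hardware, but no timing is claimed here.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::thread;

/// Hypothetical wrapper: aligns (and therefore pads) a value to 128 bytes,
/// the cache line size assumed in these notes.
#[repr(align(128))]
pub struct Padded<T>(pub T);

/// Two counters that most likely sit in the same cache line -> false sharing.
#[derive(Default)]
pub struct Shared {
    pub a: AtomicU64,
    pub b: AtomicU64,
}

/// The same counters, each on its own line -> no false sharing.
pub struct SharedPadded {
    pub a: Padded<AtomicU64>,
    pub b: Padded<AtomicU64>,
}

/// Two threads, each incrementing only its own counter `n` times.
/// The code never shares a counter; only the hardware may share the line.
pub fn hammer(a: &AtomicU64, b: &AtomicU64, n: u64) {
    thread::scope(|s| {
        s.spawn(|| {
            for _ in 0..n {
                a.fetch_add(1, Ordering::Relaxed);
            }
        });
        s.spawn(|| {
            for _ in 0..n {
                b.fetch_add(1, Ordering::Relaxed);
            }
        });
    });
}
```

Calling `hammer(&s.a, &s.b, n)` on a `Shared` and `hammer(&p.a.0, &p.b.0, n)` on a `SharedPadded` yields the same counts; only the memory layout (16 bytes vs. two separate 128-byte slots) differs.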

protogen(isca): https://dl.acm.org/doi/10.1109/ISCA.2018.00030

crossbeam_utils::CachePadded
//-> pads and aligns a value to the size of a cache line, for concurrent programming (avoids false sharing)
crossbeam_channel (re-exported as crossbeam::channel)
  • sending/receiving values among threads
  • message passing model / actor model
  • threads communicate with each other
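The message-passing idea can be sketched as follows. To stay dependency-free, this uses the standard library's `std::sync::mpsc` as a stand-in for the crossbeam channel; `sum_of_squares` is a hypothetical helper, not part of either library.

```rust
use std::sync::mpsc;
use std::thread;

/// Spawns `workers` threads; each sends its id squared over the channel.
/// Returns the sum of every value received.
pub fn sum_of_squares(workers: u64) -> u64 {
    let (tx, rx) = mpsc::channel::<u64>();

    let handles: Vec<_> = (0..workers)
        .map(|id| {
            let tx = tx.clone(); // each worker owns its own sender
            thread::spawn(move || tx.send(id * id).expect("receiver is alive"))
        })
        .collect();

    // Drop the original sender so the receiver sees end-of-stream
    // once every worker has finished.
    drop(tx);

    let total: u64 = rx.iter().sum();
    for h in handles {
        h.join().expect("worker panicked");
    }
    total
}
```

The threads share no mutable state; they communicate only by sending values.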
rayon::iter::IntoParallelIterator::into_par_iter(self) -> Self::Iter

a parallel iterator is returned; its items are processed by multiple threads.
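A rough, dependency-free sketch of what "multiple threads process the items" means. This is not how rayon is implemented (rayon uses a work-stealing thread pool); it is a hand-rolled split into fixed chunks, and `par_sum_squares` is a hypothetical name. With rayon, the equivalent would be along the lines of `data.par_iter().map(|x| x * x).sum::<u64>()`.

```rust
use std::thread;

/// Splits `data` into at most `threads` chunks, squares and sums each chunk
/// on its own thread, then combines the partial sums.
pub fn par_sum_squares(data: &[u64], threads: usize) -> u64 {
    let threads = threads.max(1);
    // Ceiling division; at least 1 because `chunks(0)` would panic.
    let chunk = ((data.len() + threads - 1) / threads).max(1);

    thread::scope(|s| {
        // Collect first so that every thread is spawned before any join.
        let handles: Vec<_> = data
            .chunks(chunk)
            .map(|part| s.spawn(move || part.iter().map(|x| x * x).sum::<u64>()))
            .collect();

        handles
            .into_iter()
            .map(|h| h.join().expect("worker panicked"))
            .sum()
    })
}
```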

Implementations

Spinlock

  • AtomicBool value + multiple shared references to the bool value -> threads try to acquire the lock
    • compare and swap(&inner, 0, 1): CAS operation -> the check and the update happen atomically
      • false (0): not acquired by any thread (available)
      • true (1): acquired by some thread
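```rust
use core::sync::atomic::{AtomicBool, Ordering};
use crossbeam_utils::Backoff;

// `RawLock` is the lock trait used in these notes (defined elsewhere);
// `SpinLock` holds `inner: AtomicBool`.
impl RawLock for SpinLock {
    type Token = ();

    fn lock(&self) {
        let backoff = Backoff::new();

        // Spin until the CAS flips `inner` from false (free) to true (held).
        while self
            .inner
            .compare_exchange(false, true, Ordering::Acquire, Ordering::Relaxed) // cas
            .is_err()
        {
            backoff.snooze();
        }
    }

    // Safety: call only while holding the lock. The CAS in `lock` ensures
    // that nobody else has changed the value of the lock since it was acquired.
    unsafe fn unlock(&self, _token: ()) {
        self.inner.store(false, Ordering::Release);
    }
}
```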
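For experimenting without the `RawLock` trait or crossbeam, here is a self-contained variant under a few assumptions: the lock owns its data, `std::hint::spin_loop()` replaces `Backoff::snooze()`, and the closure-based `with` API is a hypothetical simplification rather than the interface used above. If the closure panics, the lock stays held; a real implementation would use a guard.

```rust
use std::cell::UnsafeCell;
use std::sync::atomic::{AtomicBool, Ordering};
use std::thread;

pub struct SpinLock<T> {
    locked: AtomicBool,
    data: UnsafeCell<T>,
}

// Safety: all access to `data` is serialized by `locked`.
unsafe impl<T: Send> Sync for SpinLock<T> {}

impl<T> SpinLock<T> {
    pub const fn new(data: T) -> Self {
        Self { locked: AtomicBool::new(false), data: UnsafeCell::new(data) }
    }

    /// Acquires the lock, runs `f` with exclusive access, then releases.
    pub fn with<R>(&self, f: impl FnOnce(&mut T) -> R) -> R {
        // CAS false -> true; Acquire pairs with the Release store below.
        while self
            .locked
            .compare_exchange(false, true, Ordering::Acquire, Ordering::Relaxed)
            .is_err()
        {
            std::hint::spin_loop();
        }
        // Safety: we won the CAS, so no other thread is inside `with`.
        let result = f(unsafe { &mut *self.data.get() });
        self.locked.store(false, Ordering::Release);
        result
    }
}

/// `threads` threads each increment a shared counter `per_thread` times.
pub fn count_with_lock(threads: usize, per_thread: usize) -> usize {
    let lock = SpinLock::new(0usize);
    thread::scope(|s| {
        for _ in 0..threads {
            s.spawn(|| {
                for _ in 0..per_thread {
                    lock.with(|n| *n += 1);
                }
            });
        }
    });
    lock.with(|n| *n)
}
```

Because every increment happens under the lock, no update is lost even though the counter itself is a plain `usize`.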

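> problem of a non-atomic compare-and-swap

old = inner; if old == 0 then inner = 1

old1 = 0; if old1 == 0 then inner = 1;
old2 = 0; if old2 == 0 then inner = 1;  // multiple threads read the data before either writes, and both believe they acquired the lock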
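By contrast, an atomic CAS admits exactly one winner. A small sketch, where `cas_winners` is a hypothetical helper: several threads each try once to flip the flag from false to true, and the function counts how many succeed.

```rust
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};
use std::thread;

/// `threads` threads each attempt a single CAS(false -> true) on `inner`.
/// Returns the number of threads whose CAS succeeded.
pub fn cas_winners(threads: usize) -> usize {
    let inner = AtomicBool::new(false);
    let winners = AtomicUsize::new(0);

    thread::scope(|s| {
        for _ in 0..threads {
            s.spawn(|| {
                // The read of `inner` and the write of `true` are one
                // indivisible step, so two threads cannot both observe false.
                if inner
                    .compare_exchange(false, true, Ordering::Acquire, Ordering::Relaxed)
                    .is_ok()
                {
                    winners.fetch_add(1, Ordering::Relaxed);
                }
            });
        }
    });

    winners.load(Ordering::Relaxed)
}
```

With at least one thread, the result is always 1: the strong `compare_exchange` never fails spuriously, and every attempt after the first sees `true`.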

> Dekker's algorithm: slow!!!