datastructures Archives - ProdSens.live

Stacks, Data Structures
Published: Sun, 02 Jun 2024


Stacks

A stack is a fundamental data structure in computer science that operates on a Last In, First Out (LIFO) principle. This means that the last element added to the stack is the first one to be removed. Stacks are analogous to a pile of plates where you can only add or remove the top plate. This simplicity and the constraint on how elements are added and removed make stacks particularly useful for certain types of problems and algorithms.

Basic Concepts of Stacks

  1. Push Operation: This operation adds an element to the top of the stack. If the stack is implemented using an array and the array is full, a stack overflow error may occur.

  2. Pop Operation: This operation removes the top element from the stack. If the stack is empty, a stack underflow error may occur.

  3. Peek (or Top) Operation: This operation returns the top element of the stack without removing it. It is useful for accessing the top element without modifying the stack.

  4. isEmpty Operation: This operation checks whether the stack is empty. It returns true if the stack has no elements, otherwise false.

  5. Size Operation: This operation returns the number of elements currently in the stack.

Stacks are widely used in various applications such as parsing expressions, backtracking algorithms, function call management in programming languages, and many others. Understanding the basic operations and concepts of stacks is essential for solving problems that involve this data structure.
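As a concrete illustration of one such application, here is a small sketch (not part of the original post) that uses a stack to check whether the brackets in an expression are balanced: every opening bracket is pushed, and every closing bracket must match the most recently pushed opener.

#include <iostream>
#include <stack>
#include <string>
using namespace std;

// Returns true if every (, [, { has a matching closer in the right order.
bool isBalanced(const string& expr) {
    stack<char> s;
    for (char c : expr) {
        if (c == '(' || c == '[' || c == '{') {
            s.push(c);                        // remember the opener
        } else if (c == ')' || c == ']' || c == '}') {
            if (s.empty()) return false;      // closer with no matching opener
            char open = s.top();
            s.pop();
            if ((c == ')' && open != '(') ||
                (c == ']' && open != '[') ||
                (c == '}' && open != '{')) {
                return false;                 // mismatched pair
            }
        }
    }
    return s.empty();                         // leftover openers mean unbalanced
}

int main() {
    cout << boolalpha;
    cout << isBalanced("(a + b) * {c - [d]}") << endl; // true
    cout << isBalanced("(a + b]") << endl;              // false
    return 0;
}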

Characteristics of Stacks

  1. LIFO Structure: The defining characteristic of a stack is its Last In, First Out (LIFO) nature. This means that the most recently added element is the first one to be removed. This characteristic is crucial for scenarios where the most recent items need to be processed first.

  2. Operations are Performed at One End: All operations (push, pop, peek) are performed at the top end of the stack. This makes the stack operations very efficient in terms of time complexity, typically O(1) for these operations.

  3. Limited Access: In a stack, elements are only accessible from the top. This restricted access is what differentiates a stack from other data structures like arrays or linked lists, where elements can be accessed at any position.

  4. Dynamic Nature: When implemented using linked lists, stacks can grow and shrink dynamically as elements are added or removed. This flexibility allows stacks to handle varying sizes of data efficiently.

Understanding these characteristics helps in leveraging stacks effectively for various computational problems and in recognizing situations where a stack is the appropriate data structure to use.

Implementing Stacks

Implementing stacks can be done in several ways, with the most common methods being through arrays and linked lists. Each implementation has its own advantages and limitations, making them suitable for different scenarios.

1. Array-Based Implementation

In an array-based stack, a fixed-size array is used to store the stack elements. An index (often called the ‘top’ index) keeps track of the position of the last element added.

  • Advantages:

    • Simple to implement.
    • Provides fast access to elements (O(1) for push and pop operations).
    • Memory is contiguous, leading to better cache performance.
  • Limitations:

    • Fixed size, which means the stack can overflow if it exceeds the array’s capacity.
    • Resizing the array (to handle overflow) can be time-consuming and memory-intensive.

2. Linked List-Based Implementation

In a linked list-based stack, each element is stored in a node, with each node pointing to the next node in the stack. The top of the stack is represented by the head of the linked list.

  • Advantages:

    • Dynamic size, which means it can grow and shrink as needed without worrying about overflow (as long as memory is available).
    • No need to predefine the stack size.
  • Limitations:

    • Slightly more complex to implement compared to array-based stacks.
    • Generally has higher memory overhead due to storing pointers/references.
    • Access time might be slower due to non-contiguous memory allocation.

By understanding these different methods of implementing stacks, you can choose the one that best fits the requirements of your application, balancing between simplicity, performance, and flexibility.

Stack Implementation Using Arrays in C++

Implementing a stack using arrays in C++ is a straightforward approach that leverages the array’s contiguous memory allocation, which allows for efficient access and manipulation of elements. In this method, we use a fixed-size array to hold the stack elements and manage the stack operations using simple indexing.

Creation of the Stack

To implement a stack using arrays in C++, we first define a class Stack with its basic attributes and a constructor. Below is a part of the code for the Stack class with its fundamental attributes and constructor:

#include <iostream>
using namespace std;

class Stack {
private:
    int* arr;
    int top;
    int capacity;

public:
    // Constructor to initialize stack
    Stack(int size) {
        arr = new int[size];
        capacity = size;
        top = -1;
    }

    // Destructor to free memory allocated to the array
    ~Stack() {
        delete[] arr;
    }
};

Attributes Explanation:

  1. arr: This is a pointer to an integer array that will store the elements of the stack. The array is dynamically allocated based on the capacity provided during the stack’s initialization.

  2. top: This integer variable keeps track of the index of the top element in the stack. Initially, it is set to -1, indicating that the stack is empty.

  3. capacity: This integer variable defines the maximum number of elements that the stack can hold. It is set when the stack is initialized and does not change during the stack’s lifetime.

Constructor Explanation:

The constructor Stack(int size) initializes the stack with a specified capacity:

  • arr = new int[size]: Allocates memory for the stack’s array based on the given size.
  • capacity = size: Sets the capacity of the stack.
  • top = -1: Initializes the top index to -1, indicating that the stack is currently empty.

This setup provides the basic framework for the stack, allowing us to build upon it with the necessary operations such as push, pop, and peek. The constructor ensures that the stack is properly initialized with the specified capacity and is ready for use.

Operations on Stacks (Array Implementation)

To effectively use a stack, we need to implement several fundamental operations. These include pushing an element onto the stack, popping an element from the stack, peeking at the top element, checking if the stack is empty, and checking if the stack is full. Each of these operations can be efficiently implemented using arrays.

Push Operation

The push operation adds an element to the top of the stack. Before adding, it checks if the stack is full to avoid overflow. If the stack is full, an error message is displayed; otherwise, the element is added, and the top index is incremented.

void push(int x) {
    if (isFull()) {
        cout << "Overflow: Stack is full.n";
        return;
    }
    arr[++top] = x;
}

Time Complexity: O(1)

Pop Operation

The pop operation removes the top element from the stack. Before removing, it checks if the stack is empty to avoid underflow. If the stack is empty, an error message is displayed; otherwise, the element is removed, and the top index is decremented.

int pop() {
    if (isEmpty()) {
        cout << "Underflow: Stack is empty.n";
        return -1;
    }
    return arr[top--];
}

Time Complexity: O(1)

Peek Operation

The peek operation returns the top element of the stack without removing it. It checks if the stack is empty before accessing the top element.

int peek() {
    if (!isEmpty()) {
        return arr[top];
    } else {
        cout << "Stack is empty.n";
        return -1;
    }
}

Time Complexity: O(1)

isEmpty Operation

The isEmpty operation checks whether the stack is empty by verifying if the top index is -1.

bool isEmpty() {
    return top == -1;
}

Time Complexity: O(1)

isFull Operation

The isFull operation checks whether the stack is full by comparing the top index with the maximum capacity minus one.

bool isFull() {
    return top == capacity - 1;
}

Time Complexity: O(1)

By implementing these operations, we ensure that the stack can be efficiently used for its intended purposes, such as managing data in a LIFO order. Each operation is designed to run in constant time, ensuring quick and predictable performance.

Full Code Implementation of Stacks Using Arrays

Below is the full implementation of a stack using arrays in C++. This implementation encapsulates the stack operations within a class, providing a clean and efficient way to manage stack data.

#include <iostream>
using namespace std;

class Stack {
private:
    int* arr;
    int top;
    int capacity;

public:
    // Constructor to initialize stack
    Stack(int size) {
        arr = new int[size];
        capacity = size;
        top = -1;
    }

    // Destructor to free memory allocated to the array
    ~Stack() {
        delete[] arr;
    }

    // Utility function to add an element `x` to the stack
    void push(int x) {
        if (isFull()) {
            cout << "Overflow: Stack is full.n";
            return;
        }
        arr[++top] = x;
    }

    // Utility function to pop the top element from the stack
    int pop() {
        if (isEmpty()) {
            cout << "Underflow: Stack is empty.n";
            return -1;
        }
        return arr[top--];
    }

    // Utility function to return the top element of the stack
    int peek() {
        if (!isEmpty()) {
            return arr[top];
        } else {
            cout << "Stack is empty.n";
            return -1;
        }
    }

    // Utility function to check if the stack is empty
    bool isEmpty() {
        return top == -1;
    }

    // Utility function to check if the stack is full
    bool isFull() {
        return top == capacity - 1;
    }

    // Utility function to return the size of the stack
    int size() {
        return top + 1;
    }
};

int main() {
    Stack stack(3);

    stack.push(1);
    stack.push(2);
    stack.push(3);

    cout << "Top element is: " << stack.peek() << endl;

    cout << "Stack size is " << stack.size() << endl;

    stack.pop();
    stack.pop();
    stack.pop();

    if (stack.isEmpty()) {
        cout << "Stack is emptyn";
    } else {
        cout << "Stack is not emptyn";
    }

    return 0;
}

Stack Implementation Using Linked List in C++

Implementing a stack using a linked list in C++ allows for dynamic memory allocation, enabling the stack to grow and shrink as needed. In this method, each element of the stack is represented as a node in the linked list, with each node containing the data and a pointer to the next node.

Creation of the Stack

To implement a stack using a linked list in C++, we encapsulate the stack within a class. Below is a class-based implementation of a stack using a linked list, including its attributes and constructor.

#include <iostream>
using namespace std;

class Stack {
private:
    struct Node {
        int data;
        Node* next;
        Node(int val) : data(val), next(nullptr) {}
    };

    Node* top;

public:
    // Constructor to initialize stack
    Stack() : top(nullptr) {}

    // Destructor to free memory allocated to the linked list
    ~Stack() {
        while (!isEmpty()) {
            pop();
        }
    }
};

Attributes Explanation:

  1. top: This is a pointer to the top node of the stack, representing the last element pushed onto the stack.

Constructor Explanation:

The constructor Stack() initializes the stack by setting the top pointer to nullptr, indicating an empty stack.

  • top = nullptr: Initializes the top pointer to nullptr, indicating that the stack is empty.

This setup provides the basic framework for the stack, allowing us to build upon it with the necessary operations such as push, pop, and peek. The constructor ensures that the stack is properly initialized and ready for use.

Operations on Stacks (Linked List Implementation)

Implementing stack operations using a linked list allows for dynamic memory allocation and efficient manipulation of elements. Below are the fundamental operations of a stack – push, pop, peek, and isEmpty – along with their corresponding implementations using a linked list.

Push Operation

The push operation adds an element to the top of the stack. This operation involves creating a new node and updating the top pointer to point to this new node.

// Utility function to add an element `x` to the stack
void push(int x) {
    Node* newNode = new Node(x);
    newNode->next = top;
    top = newNode;
}

Time Complexity: O(1)

Pop Operation

The pop operation removes the top element from the stack. This operation involves updating the top pointer to point to the next node and deleting the removed node.

// Utility function to pop the top element from the stack
int pop() {
    if (isEmpty()) {
        cout << "Underflow: Stack is empty.n";
        return -1;
    }
    Node* temp = top;
    int poppedValue = temp->data;
    top = top->next;
    delete temp;
    return poppedValue;
}

Time Complexity: O(1)

Peek Operation

The peek operation returns the top element of the stack without removing it. This operation involves accessing the data of the top node.

// Utility function to return the top element of the stack
int peek() {
    if (isEmpty()) {
        cout << "Stack is empty.n";
        return -1;
    }
    return top->data;
}

Time Complexity: O(1)

isEmpty Operation

The isEmpty operation checks whether the stack is empty by verifying if the top pointer is nullptr.

// Utility function to check if the stack is empty
bool isEmpty() {
    return top == nullptr;
}

Time Complexity: O(1)

In a linked list implementation of a stack, there is typically no need for an isFull operation. This is because a linked list-based stack can theoretically grow to utilize all available memory, as long as the system has memory available for allocation.

By implementing these operations, we ensure that the stack can be efficiently used for its intended purposes, such as managing data in a Last-In-First-Out (LIFO) order. Each operation is designed to run in constant time, ensuring quick and predictable performance.

Full Code Implementation of Stacks Using Linked List

Implementing a stack using a linked list in C++ allows for dynamic memory allocation, enabling the stack to grow and shrink as needed. In this method, each element of the stack is represented as a node in the linked list, providing flexibility in managing stack operations efficiently.

#include <iostream>
using namespace std;

class Stack {
private:
    struct Node {
        int data;
        Node* next;
        Node(int val) : data(val), next(nullptr) {}
    };

    Node* top;

public:
    // Constructor to initialize stack
    Stack() : top(nullptr) {}

    // Destructor to free memory allocated to the linked list
    ~Stack() {
        while (!isEmpty()) {
            pop();
        }
    }

    // Utility function to add an element `x` to the stack
    void push(int x) {
        Node* newNode = new Node(x);
        newNode->next = top;
        top = newNode;
    }

    // Utility function to pop the top element from the stack
    int pop() {
        if (isEmpty()) {
            cout << "Underflow: Stack is empty.n";
            return -1;
        }
        Node* temp = top;
        int poppedValue = temp->data;
        top = top->next;
        delete temp;
        return poppedValue;
    }

    // Utility function to return the top element of the stack
    int peek() {
        if (isEmpty()) {
            cout << "Stack is empty.n";
            return -1;
        }
        return top->data;
    }

    // Utility function to check if the stack is empty
    bool isEmpty() {
        return top == nullptr;
    }
};

int main() {
    Stack stack;

    stack.push(1);
    stack.push(2);
    stack.push(3);

    cout << "Top element is: " << stack.peek() << endl;

    cout << "Popping elements from the stack:n";
    cout << stack.pop() << " ";
    cout << stack.pop() << " ";
    cout << stack.pop() << endl;

    if (stack.isEmpty()) {
        cout << "Stack is emptyn";
    } else {
        cout << "Stack is not emptyn";
    }

    return 0;
}

This implementation provides a complete and efficient stack data structure using a linked list, encapsulated within a class in C++. The main function demonstrates the usage of the stack class by performing various operations such as pushing, popping, peeking, and checking for emptiness.
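As a closing note (not part of the original post), production C++ code rarely hand-rolls these classes: the standard library already provides a stack adapter, std::stack, with the same push, pop, top, empty, and size operations. A minimal usage sketch:

#include <iostream>
#include <stack>
using namespace std;

int main() {
    stack<int> s;      // std::stack, by default backed by std::deque
    s.push(1);
    s.push(2);
    s.push(3);

    cout << "Top element is: " << s.top() << endl;   // 3
    cout << "Stack size is " << s.size() << endl;    // 3

    while (!s.empty()) {
        s.pop();
    }
    cout << (s.empty() ? "Stack is empty" : "Stack is not empty") << endl;
    return 0;
}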

Primitive Data Structure
Published: Thu, 09 May 2024


What is Primitive Data Structure 🌲

Primitive data structures are basic and fundamental data structures provided by programming languages. They are directly operated upon by machine instructions and are typically built into the programming language itself. Some common examples of primitive data structures include:

1. Integer: Used to store whole numbers.
2. Float/Double: Used to store floating-point numbers with decimal points.
3. Character: Used to store individual characters.
4. Boolean: Used to store true or false values.

Primitive data structures are simple and have fixed sizes, making them efficient in terms of memory usage and execution speed. They are usually atomic in nature, meaning they cannot be further subdivided.
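To make the fixed-size point concrete, here is a small sketch (not part of the original post) that prints how much storage each primitive type occupies. The exact byte counts are implementation-defined; the values in the comments are what a typical 64-bit platform reports.

#include <iostream>
using namespace std;

int main() {
    int    wholeNumber = 42;      // integer
    double price       = 19.99;   // floating-point number
    char   grade       = 'A';     // single character
    bool   isReady     = true;    // true/false value

    // Sizes are fixed at compile time; typical values on a 64-bit platform:
    cout << "int:    " << sizeof(wholeNumber) << " bytes\n";  // usually 4
    cout << "double: " << sizeof(price)       << " bytes\n";  // usually 8
    cout << "char:   " << sizeof(grade)       << " bytes\n";  // always 1
    cout << "bool:   " << sizeof(isReady)     << " bytes\n";  // usually 1
    return 0;
}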

Data sorting
Published: Fri, 12 Apr 2024


Sorting data involves putting it in a certain order. It is a basic data manipulation technique that is applied to a wide range of data analysis and visualization tasks.
Depending on the type of data and the desired outcome, there are several ways to sort data.

Typical sorting techniques include:
● Ascending order: This is the most common sorting method, and it arranges data from smallest to largest.
● Descending order: This method arranges data from largest to smallest.
● Custom order: This method allows you to specify the order in which data is sorted.

Data sorting in programming is the process of arranging data in a specific order. This can be done in ascending or descending order and can be based on any criteria, such as the value of a variable, the date an event occurred, or the alphabetical order of a word.
There are several sorting algorithms, each with unique advantages and disadvantages. Some of the most common sorting algorithms include:
● Bubble sort
● Selection sort
● Insertion sort

A sorting algorithm should be chosen based on the particular requirements of the application. For example, if the data set is relatively small, then a simple algorithm like bubble sort may be sufficient. However, if the data is large or if speed is critical, then a more efficient algorithm like merge sort or quicksort should be used.

Benefits of data sorting:
● Improved data organization: Sorting data can help to improve its organization and make it easier to find and use.
● Increased efficiency: Sorting data can help to improve the efficiency of data processing and analysis.
● Improved accuracy: Sorting data can help to improve the accuracy of data analysis and reporting.
● Enhanced data visualization: Sorting data can help to enhance the visualization of data, making it easier to understand and interpret.

Bubble Sort

Bubble sort is a simple sorting algorithm that repeatedly compares adjacent elements in an array and swaps them if they are in the wrong order. The algorithm compares the first two entries and swaps them if needed, then compares the next two elements, and so on; this pass continues until the end of the array is reached, and passes are repeated until the array is sorted. Bubble sort is a very simple algorithm to understand and implement, but it is not very efficient. The worst-case time complexity is O(n²), where n is the number of items in the array, meaning the running time grows proportionally to the square of the number of elements. For example, sorting a 5-element array with the implementation below takes 4 + 3 + 2 + 1 = 10 comparisons, which is n(n-1)/2.

Bubble sort is not a good choice for sorting large arrays. However, it can be a good choice for sorting small arrays or arrays that are already partially sorted.


public class BubbleSort {
   public static void bubbleSort(int[] array) {
       int n = array.length;
          for (int i = 0; i < n - 1; i++) {
             for (int j = 0; j < n - i - 1; j++) {
                 if (array[j] > array[j + 1]) {
                    int temp = array[j];
                    array[j] = array[j + 1];
                    array[j + 1] = temp;
                 }
             }
          }
     }

   public static void main(String[] args) {
      int[] array = {5, 3, 1, 2, 4};
      bubbleSort(array);
      for (int i = 0; i < array.length; i++) {
         System.out.print(array[i] + " ");
      }
   }
}

Advantages and disadvantages of bubble sort

Advantages:
● Bubble sort is a very simple algorithm to understand and implement.
● Bubble sort is stable, meaning that it does not change the relative order of elements that are equal.

Disadvantages:
● Bubble sort is not very efficient, especially for large arrays.
● The basic version shown above gains nothing from arrays that are already partially sorted; it still performs every pass unless an early-exit check (stop when a pass makes no swaps) is added.

Selection Sort

Selection sort is a simple in-place comparison sorting algorithm in computer science. It works by dividing the input list into two parts:

  1. sorted sublist
  2. unsorted sublist.

The algorithm repeatedly selects the smallest (or largest) element from the unsorted sublist and swaps it with the leftmost unsorted element, thereby expanding the sorted sublist. This process continues until the entire list is sorted. Selection sort has a time complexity of O(n^2), which makes it inefficient for large lists compared to more advanced algorithms like merge sort or quicksort.

The algorithm can be summarized in the following steps:

  1. Divide the input list into two sublists: the sorted sublist and the unsorted sublist.
  2. Initially, the sorted sublist is empty, and the unsorted sublist contains all the elements.
  3. Find the minimum (or maximum) element in the unsorted sublist.
  4. Swap the minimum (or maximum) element with the leftmost element in the unsorted sublist, putting it in its correct sorted position.
  5. Move the sublist boundaries one element to the right, expanding the sorted sublist and reducing the unsorted sublist.
  6. Repeat steps 3 to 5 until the entire list is sorted.


public class SelectionSort {
   public static void selectionSort(int[] array) {
      int n = array.length;
      for (int i = 0; i < n - 1; i++) {
         int minIndex = i;
         for (int j = i + 1; j < n; j++) {
            if (array[j] < array[minIndex]) {
               minIndex = j;
            }
         }
         int temp = array[minIndex];
         array[minIndex] = array[i];
         array[i] = temp;
      }
   }  
   public static void main(String[] args) {
      int[] array = {10, 8, 7, 6, 5, 4, 3, 2, 1};
      selectionSort(array);
      for (int i = 0; i < array.length; i++) {
         System.out.print(array[i] + " ");
      }
   }
}

Selection sort is renowned for its simplicity and offers certain benefits when auxiliary memory is constrained. Due to its quadratic time complexity, it typically performs worse than more sophisticated sorting algorithms on big lists. In practice, insertion sort, a comparably simple algorithm, frequently performs better than selection sort.

Here are the advantages and disadvantages of selection sort:

Advantages:
● Simple to implement - Selection sort is a straightforward algorithm that can be implemented easily in any programming language. This makes it a good choice for beginners who are learning about sorting algorithms.
● In-place sorting algorithm: Selection sort does not require any additional memory to sort the data. This makes it a good choice for applications where memory is limited.
● Can be used for small data sets: Selection sort is a relatively efficient algorithm for small data sets. However, its performance degrades significantly for large data sets.

Disadvantages:
● Slow for large data sets
● Not stable: Selection sort is not a stable sorting algorithm, meaning it can change the relative order of elements with equal values. For example, when sorting [2a, 2b, 1], where 2a and 2b are equal keys, the first swap moves 1 to the front and 2a to the end, so the output is [1, 2b, 2a] and the two equal elements have traded places.
● Not adaptive: Selection sort is not an adaptive sorting algorithm, meaning its performance does not improve when the data is already partially (or fully) sorted. For example, even if the array [1, 2, 3, 4] is already in order, selection sort still performs the same n(n-1)/2 = 6 comparisons it would perform on a shuffled array such as [4, 2, 1, 3].

Insertion sort

Insertion sort is a simple sorting method that builds up a sorted subarray one element at a time. The algorithm starts by treating the array's first element as already sorted. It then takes each subsequent element and compares it with the items in the sorted subarray, shifting the larger items one position to the right until the correct location is found, and inserts the element there. If the element is already larger than everything in the sorted subarray, it stays in place and the algorithm moves on to the following element.


public class InsertionSort {
   public static void sort(int[] array) {
      int n = array.length;
      for (int i = 1; i < n; i++) {
         int key = array[i];
         int j = i - 1;
         while (j >= 0 && array[j] > key) {
            array[j + 1] = array[j];
            j--;
         }
         array[j + 1] = key;
      }
   }
   public static void main(String[] args) {
      int[] array = {10, 8, 7, 6, 5, 4, 3, 2, 1};
      sort(array);
      for (int i = 0; i < array.length; i++) {
         System.out.print(array[i] + " ");
      }
   }
}

Advantages and disadvantages of insertion sort:

Advantages:
● Simple to implement
● In-place sorting algorithm
● Stable sorting algorithm (elements with equal keys retain their original order)
● Efficient for small lists

Disadvantages:
● Not as efficient for large lists
● Slow for reverse-sorted lists (the worst case)
● Requires O(n^2) time in the worst case

Here are some examples of when insertion sort might be a good choice:

● Sorting a small list of numbers
● Sorting a list of strings that are already in alphabetical order
● Sorting a list of records that have a primary key field

Here are some examples of when insertion sort might not be a good choice:

● Sorting a large list of numbers
● Sorting a list of strings that are not in alphabetical order
● Sorting a list of records that do not have a primary key field

Leetcode Solution: #206: Reverse Linked List 🐬
Published: Thu, 21 Mar 2024


Question Type: Easy 🎚
Complexities: Time: O(n), Space: O(n) 🚩

Code: 👇

class Solution {
 public:
  ListNode* reverseList(ListNode* head) {
    if (!head || !head->next)
      return head;

    ListNode* newHead = reverseList(head->next);
    head->next->next = head;
    head->next = nullptr;
    return newHead;
  }
};
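For comparison (not part of the original post), the same list can also be reversed iteratively, which keeps the O(n) time but drops the auxiliary space from O(n), used by the recursion stack above, to O(1). A minimal ListNode definition is included so the sketch stands on its own; on LeetCode the struct is already provided.

// A minimal version of LeetCode's predefined singly-linked list node.
struct ListNode {
    int val;
    ListNode* next;
    ListNode(int x) : val(x), next(nullptr) {}
};

// Iterative variant: reverses the list in place using O(1) extra space.
class IterativeSolution {
 public:
  ListNode* reverseList(ListNode* head) {
    ListNode* prev = nullptr;
    ListNode* curr = head;
    while (curr != nullptr) {
      ListNode* next = curr->next;  // remember the remainder of the list
      curr->next = prev;            // point the current node backwards
      prev = curr;                  // advance prev to the current node
      curr = next;                  // move on to the remainder
    }
    return prev;                    // prev is now the new head
  }
};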

Leetcode Solution: #1669 Merge In Between Linked Lists 🚀
Published: Wed, 20 Mar 2024


Question Type: Medium 🎚
Complexities: Time: O(n), Space: O(1) 🚩

Code: 👇

class Solution {
 public:
  ListNode* mergeInBetween(ListNode* list1, int a, int b, ListNode* list2) {
    ListNode* nodeBeforeA = list1;
    for (int i = 0; i < a - 1; ++i)
      nodeBeforeA = nodeBeforeA->next;

    ListNode* nodeB = nodeBeforeA->next;
    for (int i = 0; i < b - a; ++i)
      nodeB = nodeB->next;

    nodeBeforeA->next = list2;
    ListNode* lastNodeInList2 = list2;

    while (lastNodeInList2->next != nullptr)
      lastNodeInList2 = lastNodeInList2->next;

    lastNodeInList2->next = nodeB->next;
    nodeB->next = nullptr;
    return list1;
  }
};

Follow roshan_earth286 for More! ✌

Is Data Struct about memory?
Published: Sun, 27 Aug 2023


A principle that I try to follow is to think about obvious concepts. From the moment we are introduced to society, we encounter a lot of concepts that we never even question. Some authoritative figure, like a teacher or a parent, tells us things that we simply accept.

If you think about data structures, you probably relate them immediately to memory. Today, I want to discuss why data structures are not only about memory.

Introduction

To be clear, there is nothing wrong with thinking of data structures as a way to organize data in memory. The point is that, as a concept, data structures are indifferent to any particular implementation.

A data structure is a way to organize, manage, and store data (yes, you can store things in math: if you think about matrices or sets, you can see that each is a way to organize data, and in practice, storing things is really about organizing things). It is a collection of values on which we can perform operations. In essence, it is an algebraic structure of data.

Algebraic Structure

An algebraic structure is a set of elements with one or more operations that satisfy some specific properties (axioms). An example is Boolean algebra, a set with two binary operations, union and intersection. You can follow this link to understand more about Boolean algebra.

At this point, I want to highlight that algebraic structures are essentially about math. It describes the properties of a set of elements: a group, a ring, an algebra, etc. If you want to dive into this topic, read Algebraic Structures.

It is a big mistake to take an abstract and mathematical concept and reduce it to a simple implementation that needs physical and limited resources. You do not need any computer or memory to understand or use data structures. You need data. And data in math is just a set of elements.

Let’s to think about computational models

The root of this problem is that you may not know about computational models. This is an important step, because once you see that in computer science we use math to model computational problems, and that computational problems are mathematical problems, you will probably never again reduce a computational problem to a physical and limited resource (like memory).

Computer science problems follow from math problems, and we use math to model them. It means that the problem you're trying to solve can be modeled as a mathematical problem before you start to think about any computational implementation.

Conclusion

I hope you now understand that data structures are not only about memory. They are a mathematical concept that we use to model computational problems.

This text is a short talk about some ideas that I wanted to share. I tried to give some references to help if you want to dive into this topic. Also, it's very important to think about obvious concepts and try to understand why things are the way they are.

Product of Array Except Self – LeetCode Java Solution
Published: Fri, 11 Aug 2023


Hello readers, let’s solve a LeetCode problem today.

In this blog, let’s solve the problem Product of Array Except Self which is one of the Blind 75 List of LeetCode Problems.

This is a LeetCode Medium problem with multiple potential solutions, which we will go through in this blog.

Understand the Problem

The objective of the problem is to return, for each index, the product of all elements in the input array except the element at that index. The problem also asks for an algorithm that runs in O(n) time without using the division operation.

Understand the Testcases

To test the solution, we can consider the following cases:

  1. Input: [1, 2, 3, 4]
    Output: [24, 12, 8, 6]
    Explanation: The product of all elements except the one at index 0 is 2 * 3 * 4 = 24. The product of all elements except the one at index 1 is 1 * 3 * 4 = 12. Similarly, for index 2, the product is 1 * 2 * 4 = 8, and for index 3, the product is 1 * 2 * 3 = 6.
  2. Input: [4, 2, 1, 5, 3]
    Output: [30, 60, 120, 24, 40]
    Explanation: The product of all elements except the one at index 0 is 2 * 1 * 5 * 3 = 30. The product of all elements except the one at index 1 is 4 * 1 * 5 * 3 = 60. Similarly, for index 2, the product is 4 * 2 * 5 * 3 = 120, for index 3, the product is 4 * 2 * 1 * 3 = 24, and for index 4, the product is 4 * 2 * 1 * 5 = 40.
  3. Input: [1, 1, 1, 1, 1]
    Output: [1, 1, 1, 1, 1]
    Explanation: Since all elements are the same, the product of all elements except the one at each index will be the same element itself.
  4. Input: [0, 0, 0, 0]
    Output: [0, 0, 0, 0]
    Explanation: Since there are zero values in the input array, the product of all elements except the one at each index will be zero.
  5. Input: [2, 3, 0, 4]
    Output: [0, 0, 24, 0]
    Explanation: The product of all elements except the one at index 0 is 3 * 0 * 4 = 0. The product of all elements except the one at index 1 is 2 * 0 * 4 = 0. For index 2, the product is 2 * 3 * 4 = 24, and for index 3, the product is 2 * 3 * 0 = 0.

Brute Force Approach

The brute force approach will be to calculate the product of all values in the array except the current value using two nested iterations.

Key Points:

  • For each value, calculate the product of all other values by iterating through the array twice.
  • This approach uses two nested loops to multiply each element with all other elements, excluding itself.
  • However, this method is inefficient and may lead to a Time Limit Exceeded (TLE) error for larger test cases.
class Solution {
    public int[] productExceptSelf(int[] nums) {
        int size = nums.length;
        int[] answer = new int[size];
        for(int i=0; i<size; i++) {
            answer[i] = 1;
        }
        for(int i=0; i<size; i++) {
            for(int j=0; j<size; j++) {
                if(i==j) continue;
                answer[i] *= nums[j];
            }
        }
        return answer;
    }
}

Time Complexity: O(N^2)

  • The time complexity of the brute force approach is quadratic, as it involves nested loops that iterate through the array.

Space Complexity: O(1)

  • The brute force approach doesn’t require any additional space other than the answer array, so the space complexity is constant.

Better Approach

class Solution {
    public int[] productExceptSelf(int[] nums) {
        int size = nums.length;
        int[] answer = new int[size];
        int[] prefix =  new int[size];
        int[] suffix =  new int[size];
        int temp = 1;
        for(int i=0; i<size; i++) {
            prefix[i] = temp;
            temp *= nums[i];
        }
        temp = 1;
        for(int j=size-1; j>=0; j--) {
            suffix[j] = temp;
            temp *= nums[j];
        }
        for(int i=0; i<size; i++) {
            answer[i] = prefix[i] * suffix[i];
        }
        return answer;
    }
}

Key Points:

  • The better approach uses separate iterations to calculate prefix and suffix values for each element.
  • In this approach, we calculate the prefix products while traversing the array from left to right and storing them in the prefix array.
  • Then, we calculate the suffix products while traversing the array from right to left and storing them in the suffix array.
  • Finally, we multiply the corresponding prefix and suffix values to get the final products.

Time Complexity: O(N)

  • The better approach performs two linear passes through the array, resulting in a linear time complexity.

Space Complexity: O(N)

  • The better approach requires additional space to store the prefix and suffix arrays, resulting in linear space complexity.

Optimal Approach

class Solution {
    public int[] productExceptSelf(int[] nums) {
        int[] answer = new int[nums.length];

        int prefix = 1;
        for(int i=0; i<nums.length; i++) {
            answer[i] = prefix;
            prefix *= nums[i];
        }

        int suffix = 1;
        for(int j=nums.length-1; j>=0; j--) {
            answer[j] *= suffix;
            suffix *= nums[j];
        }

        return answer;
    }
}

Key Points:

  • The optimal approach further improves efficiency by using the output array to store the prefix products.
  • In this approach, we calculate the prefix products while traversing the array from left to right.
  • Then, we calculate the suffix products while traversing the array from right to left.
  • Finally, we multiply the suffix product with the corresponding prefix product to obtain the final answer.
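As a concrete trace of these two passes (not in the original post, but derived directly from the code above) on the first test case, [1, 2, 3, 4]:

  • Left-to-right pass (prefix products): answer becomes [1, 1, 2, 6] and prefix ends at 24.
  • Right-to-left pass (suffix products): answer[3] = 6 * 1 = 6, answer[2] = 2 * 4 = 8, answer[1] = 1 * 12 = 12, answer[0] = 1 * 24 = 24.
  • Final result: [24, 12, 8, 6], which matches the expected output.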

Time Complexity: O(N)

  • The optimal approach also performs two linear passes through the array, resulting in a linear time complexity.

Space Complexity: O(1)

  • The optimal approach doesn’t require any additional space other than the output array, so the space complexity is constant.

Conclusion

  • The brute force solution has a time complexity of O(N²) but no extra memory usage.
  • Using two extra arrays improves the time complexity to O(N) but also requires extra memory with the complexity of O(N).
  • Using the linear traversals and answer array offers the best time complexity of O(N) but doesn’t require any additional space other than the output array, so the space complexity is constant, O(1).

NeetCode Solution Video

https://www.youtube.com/watch?v=bNvIQI2wAjk

Basics of DS Algo Blogs:

1. Hashing in Java

Recommended YouTubers for LeetCode Problems:

1. NeetCode

2. Take U Forward

Free Resources for Learning Data Structures and Algorithms:

1. NeetCode Roadmap

2. Striver’s SDE Sheet

Recommended Courses for Learning Data Structures and Algorithms:

  1. NeetCode Courses
  2. ZTM: Mastering the Coding Interview (Big Tech): Available on Udemy and ZTM Academy
  3. ZTM: Mastering the Coding Interview: Available on Udemy and ZTM Academy
  4. Data Structures & Algorithms, Level-up for Coding Interviews Course
  5. Striver’s A2Z (Free) Course

Top Coursera Courses for Learning Data Structures and Algorithms:

  1. Coding Interview Preparation (Meta)
  2. Algorithms Course Part I (Princeton University)
  3. Algorithms Course Part II (Princeton University)
  4. Data Structures and Algorithms Specialization (UC San Diego)
  5. Algorithms Specialization (Stanford)

(Note: The Coursera courses can be audited to get free access to the lectures)

🎙 Disclosure: Please note that some of the links mentioned on this page may be affiliate links. This means that if you click on one of these links and make a purchase, I may earn a small commission from the sale.

Who Am I?

I’m Aswin Barath, a Software Engineering Nerd who loves building Web Applications, now sharing my knowledge through Blogging during the busy time of my freelancing work life. Here’s the link to all of my craziness categorized by platforms under one place: https://linktr.ee/AswinBarath

Keep Learning

Now, I guess this is where I say goodbye 👋. But, hey it’s time for you to start learning with your newfound Knowledge(Power)👨‍💻👩‍💻. Good Job that you made it this far 👏Thank you so much for reading my Blog 🙂.

Caching Git Repos: A Deep Dive into OpenSauced’s ‘Pizza Oven’ Service
Published: Wed, 09 Aug 2023


Over the last few weeks, the OpenSauced engineering team has been building a service we’re calling the “pizza oven.” This service indexes commits within bespoke git repositories and can be used to generate insights based on those commits. This all gives us the ability to create interesting metrics around open source project velocity, “time to merge”, the who’s who of contributors, and more; all by indexing and parsing the git commits! We’ve been experimenting with many different models and have created an interesting solution for increased performance and availability of the service.

Initially, as a proof of concept, in order to index individual commits in a git repo, the pizza oven would do the most basic thing: clone the repo directly into memory and parse through all of its commits, inserting new commits into a configured database.

The hot path on the server would include this Go code which clones the git repo directly into memory:

inMemRepo, err := git.Clone(memory.NewStorage(), nil, &git.CloneOptions{
   URL:         URL,
   SingleBranch: true,
})

But there’s an obvious performance bottleneck: git cloning repos can be very slow. A single large git repository can contain tens of thousands, if not hundreds of thousands of objects that need to be fetched from a remote source and uncompressed. Further, re-cloning and re-parsing the git repo every time it is queried from the service is an exhaustive waste of compute resources, especially if there are already existing commits that have been indexed. And finally, this service would need to be supported by compute instances with significant amounts of expensive volatile memory in order to clone many different repos concurrently at scale.

There has to be a better way!

Enter the LRU cache: an efficient way to keep frequently queried items readily available for processing without having to always rebuild the in-memory data structures. An LRU cache is a “Least Recently Used” cache that evicts items that have not been processed or “hit” recently.

You can sort of think of it like a “priority” queue where the members of the queue are bumped to the front of the line whenever they are called upon. But members of the queue that fall to the back of the line eventually get evicted based on certain constraints. For a very basic example, if an LRU cache can only be 10 members long, whenever a new entry arrives, it is immediately put to the front of the queue and the last member is evicted if the size now surpasses 10.

More practically speaking, an LRU cache is implemented with two common data structures: a doubly linked list and a hashmap. The hashmap keeps a fast record of key / value pairs within the cache and the doubly linked list tracks the “positioning” of each item within the cache. Using both of these, you can efficiently create a system where items in the hashmap can be quickly retrieved and positioning is determined by the front and back of the doubly linked list.

In Go, we can use the standard library “list.List” as the doubly linked list and the built-in “map” as the hashmap:

type GitRepoLRUCache struct {
    // a doubly linked list to support the LRU cache behavior
    dll *list.List

    // a hashmap to support the LRU cache behavior
    hm map[string]*list.Element
}

The “Element” in this implementation of an LRU cache can really be anything you want. In our case, we decided to track git repositories on disk that have already been cloned. This gets around the constraint of having large chunks of memory used up through cloning directly into memory. Instead, we can index repos already on disk or clone new ones as requests come in.

For this, we created the GitRepoFilePath struct that denotes a key/value pair which points to a filepath on disk where the repo has already been cloned.

type GitRepoFilePath struct {
    // The key for the GitRepoFilePath key/value pair
    // generally, is the remote URL for the git 
    // repository
    key string

    // path is the value in the GitRepoFilePath
    // key/value and denotes the filepath
    // on-disk to the cloned git repository
    path string
}

Using the LRU cache, its data structures, and the GitRepoFilePath as the “Elements” in the cache, frequently used git repos on disk can be easily, cleanly, and efficiently updated without having to re-clone them.

Typically, there are two methods that make up an LRU cache’s API: “Get” and “Put”. Both may be obvious, but “Get” returns a member from the cache based on its key, placing that returned item to the front of the doubly linked list. If the queried key in the cache is not present, “Get” returns a nil Element:

func (c *GitRepoLRUCache) Get(key string) *GitRepoFilePath {

“Put” is a bit more complicated and is where a lot of the magic ends up happening: when a key/value pair is “Put” to the cache, the cache must first evict members based on its criteria.

But what sort of eviction criteria makes sense for a cache of git repositories? Well, with a service that caches git repos onto disk, the obvious metric to track is “free space” on that disk. Someone deploying this service can configure the amount of free disk they anticipate always needing to be available and can also configure the specific directory they want to use as the cache on the system. This provides a nice buffer to prevent the disk from completely filling up and potentially causing its own problems.

During “Put”, when cloning a new repo and putting it into the cache, if the amount of used disk surpasses the configured “free disk” space, the LRU cache will evict repos and delete them from the disk. This process continues until the configured “free disk” is less than the actual amount of free disk space at the designated cache path.

In the Go code, this is what the function signature ends up looking like:

func (c *GitRepoLRUCache) Put(key string) (*GitRepoFilePath, error)

This all works really well in a single threaded model, but sort of falls apart when you need to concurrently serve many different requests. What happens when a request comes in for the same repo at the same time? How can the cache handle multiple requests at the same time?

With a few tweaks and modifications, we can make this cache implementation thread safe!

First, we need to enable cache operations to be atomic. We can do this by adding a mutex lock to the cache itself:

type GitRepoLRUCache struct {
    // A locking mutex for atomic operations
    lock sync.Mutex

    // a doubly linked list to support the LRU cache behavior
    dll *list.List

    // a hashmap to support the LRU cache behavior
    hm map[string]*list.Element
}

This mutex on the cache can then be locked and unlocked during atomic cache operations.

Let’s look at the “Get” method for how this all works. When “Get” is called, the cache’s mutex is locked, allowing operations to continue. This call to “c.lock.Lock()” will block until the mutex is in an unlocked state which indicates other threads are done operating on the cache:

func (c *GitRepoLRUCache) Get(key string) *GitRepoFilePath {
    // Lock (and unlock when done) the cache's mutex
    c.lock.Lock()
    defer c.lock.Unlock()

    if element, ok := c.hm[key]; ok {
     // Cache hit
     c.dll.MoveToFront(element)
     return element.Value.(*GitRepoFilePath)
    }

    // Cache miss
    return nil
}

The defer c.lock.Unlock() is a nice way in Go of making sure that the mutex is always unlocked before this function scope closes. The worst thing possible here would be a deadlock, where a thread never unlocks the cache’s mutex and no other threads can then operate on the cache, hanging when they call c.lock.Lock().

This ensures that the cache itself is thread safe, but what about the individual elements within the cache? If cache operations themselves are really fast, isn’t there a possibility that an Element could be evicted before its git operations have completed? Eviction of an element during processing of git commits would be really bad since this entails removing the git repo from disk entirely which would cause an unrecoverable state of the indexed commits.

One solution would be to just extend the cache’s mutex to not unlock until processing on individual elements has finished. But the astute concurrent programmer will see that this returns the cache to a single threaded data structure without any real ability to do concurrent operations.

Instead, the individual GitRepoFilePath elements can also have a locking mutex:

type GitRepoFilePath struct {
    // Locking mutex for atomic file operations
    lock sync.Mutex

    // The key for the GitRepoFilePath key/value pair
    // generally, is the remote URL for the git 
    // repository
    key string

    // path is the value in the GitRepoFilePath
    // key/value and denotes the filepath
    // on-disk to the cloned git repository
    path string
}

Now, when elements are returned from the cache operations, they themselves can be locked to prevent deadlocks or removal before they have finished processing. Let’s look at the “Get” method again to see how it works with locking the individual element when a cache hit occurs:

func (c *GitRepoLRUCache) Get(key string) *GitRepoFilePath {
    // Lock (and unlock when done) the cache's mutex
    c.lock.Lock()
    defer c.lock.Unlock()

    if element, ok := c.hm[key]; ok {
     // Cache hit
     c.dll.MoveToFront(element)

        // Lock the git repo filepath element
     element.Value.(*GitRepoFilePath).lock.Lock()
     return element.Value.(*GitRepoFilePath)
    }

    // Cache miss
    return nil
}

Notice that the queried element is locked before it is returned. Then, later, after the caller has finished processing the returned GitRepoFilePath, they can call the Done method. This is a simple, thin wrapper around unlocking the mutex, but it ensures that any consumer of a GitRepoFilePath can “clean up” their state once processing has finished.

func (g *GitRepoFilePath) Done() {
    g.lock.Unlock()
}

A similar structuring of locking and unlocking these mutexes in “Put” and during the eviction process, all working together, allows for the cache and its elements to be thread safe and concurrently operated on.

At scale, using this LRU caching method, we can prevent the re-cloning of frequently queried git repos and speed up the service drastically. Make sure to check out the open source code for this service and all the details on this implementation of an LRU cache!

Stay Saucy! 🍕

Array Strengths, Weaknesses, and Big-O Complexity Analysis
Published: Sat, 22 Jul 2023


Read the Array Data Structures article before reading through this article.

Arrays are fundamental data structures in computer science and programming, offering a range of strengths, weaknesses, and Big-O complexities that impact their efficiency and usability.

Understanding the characteristics of arrays is crucial for choosing the right data structure for specific tasks and optimizing program performance.

In this article, we delve into arrays’ strengths, weaknesses, and Big-O complexities. We explore the benefits arrays provide, such as random and sequential access, simplicity of implementation, and cache-friendliness. At the same time, we address their limitations, including their fixed size, the cost of insertion and deletion operations, and their inflexibility.

Additionally, we discuss the time complexities associated with common array operations, such as access, search, insertion, deletion, and resizing. By gaining insights into these aspects, programmers can make informed decisions when utilizing arrays and effectively balance trade-offs between efficiency and functionality in their applications.

Strengths

There are many advantages to using arrays, some of which are outlined below:

  1. Fast lookups (random access)
  2. Fast appends
  3. Simple implementation
  4. Cache friendliness

1. Fast lookups (Random access)

Retrieving the element at a given index takes O(1) time, regardless of the array’s length.

For example:

int[] A = {-1, 7, 9, 100, 1072};

The array has five elements, and its length is obtained using A.length, which is 5. As arrays are zero-indexed, the last element is accessed using A[A.length - 1], which is A[4], as shown in the following sketch.

Array items

If we access array elements using an index, like A[0] or A[4], it takes a single unit of time, or, in Big-O terms, a constant-time operation.

A[0]; // -1
A[3]; // 100
A[4]; // 1072
A[5]; // Index out of bounds exception, as there is no A[5] value.

All of the above operations consume a single unit of time which is O(1) time.

2. Fast appends

Adding a new element at the end of the array takes O(1) time if the array has space.

Let us create an array with a capacity of 5 and insert the values 100 and 101 at indexes 0 and 1.

The following code explains it.

int[] A = new int[5];

A[0] = 100;
A[1] = 101;

Array with capacity

Now, if we were to insert a new value into the array, we could do A[2] = 200, which inserts the value 200 at index 2. This operation consumes a single unit of time, which is constant.

Array item inserted

This is the reason the appends at the end are fast.

Enough talk. Here is a simple program that creates an array of size 5 and inserts the values 100 and 101 into it. Finally, we insert an element at the current length, which is the next free index.

import java.util.Arrays;

public class ArrayAppends {
  public static void main(String[] args) {
    int[] A = new int[5];
    int currentLength = 0;

    // Let us add 2 elements to the array
    for (int i = 0; i < 2; i++) {
      A[i] = i + 100;
      currentLength++; // when i=1, length is set to 2
    }

    System.out.println(Arrays.toString(A)); // [100, 101, 0, 0, 0]
    System.out.println("current array items length " + currentLength); // 2
    System.out.println("Array capacity " + A.length); // 5
    System.out.println(
        "Element insert at end "
            + Arrays.toString(insertAtEnd(A, currentLength))); // [100, 101, 200, 0, 0]
  }

  // Inserting element at the end
  public static int[] insertAtEnd(int[] A, int currentLength) {
    A[currentLength] = 200;
    return A;
  }
}

/* 
Outputs:
[100, 101, 0, 0, 0]
current array items length 2
Array capacity 5
Element insert at end [100, 101, 200, 0, 0]
*/

3. Simple Implementation

Arrays have a straightforward implementation in most programming languages, making them easy to understand and use.

4. Cache Friendliness

Elements in an array are stored contiguously in memory, which improves cache performance and can lead to faster access times.

Array occupies contiguous memory

Weaknesses

There are some disadvantages to using arrays, some of which are outlined below:

  1. Fixed size
  2. Memory unused or wasted
  3. Size doubling
  4. Costly inserts
  5. Costly deletes

1. Fixed-size

Arrays have a fixed size defined at the time of creation. Adding or removing elements beyond the initial size requires creating a new array and copying the existing elements, which can be inefficient.

You need to specify how many elements you will store in your array ahead of time. (Unless you’re using a fancy dynamic array).

int[] A = new int[5]; // contains 5 elements

2. Memory unused or wasted

If an array’s size is larger than the number of elements it contains, memory is wasted.

Imagine an array with a capacity of 5. If we only have two elements to store in it, the remaining three cells stay unfilled, which means 3 * (4 bytes) = 12 bytes of memory are wasted (an integer takes 4 bytes).

Array Capacity

3. Size doubling

Let us consider an array with a capacity of 5 elements. If we want to store more elements than that, we have to double the capacity: create a new array, copy the old elements over, and then add the new elements. The time complexity of this is O(n).

Array size doubling issues

You will learn how to double the array size in the next lessons.
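
Until then, here is a minimal sketch of what the doubling step could look like in Java. The grow helper and its name are illustrative assumptions, not the implementation from those lessons.

import java.util.Arrays;

public class ArrayDoubling {
  // Hypothetical helper: allocate a new array with twice the capacity
  // and copy the old elements into it. The copy loop is what makes this O(n).
  public static int[] grow(int[] old) {
    int[] bigger = new int[old.length * 2];
    for (int i = 0; i < old.length; i++) {
      bigger[i] = old[i];
    }
    return bigger;
  }

  public static void main(String[] args) {
    int[] A = {100, 101, 102, 103, 104}; // capacity 5, already full

    // No room for a 6th element, so double the capacity first, then append.
    A = grow(A);
    A[5] = 105;

    System.out.println(Arrays.toString(A)); // [100, 101, 102, 103, 104, 105, 0, 0, 0, 0]
    System.out.println("New capacity " + A.length); // 10
  }
}

The copy loop inside grow is the reason resizing costs O(n) time.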

4. Costly inserts

Inserting/appending an element at the end of the array takes O(1) time. We have seen this in the strengths (fast appends).

But, inserting an element at the start/middle of the array takes O(n) time. Why? 🤔

If we want to insert something into an array, first, we have to make space by “scooting over” everything starting at the index we’re inserting into, as shown in the image. In the worst case, we’re inserting into the 0th index in the array (prepending), so we have to “scoot over” everything. That’s O(n) time.

Array insert and shifting algorithms

The sketch shows an element being inserted at index 2, with each of the following elements shifted one position to the right to make room.

In the next lessons, you will learn more about insertion and shifting algorithms, with clear explanations, code snippets, and sketches to understand why these inserts are expensive at the start and middle.
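
As a rough preview, here is a minimal sketch of such an insert. The insertAt helper is a hypothetical name, and it assumes the array still has at least one free slot at the end.

import java.util.Arrays;

public class ArrayInsert {
  // Hypothetical helper: insert value at index, shifting everything from
  // index one position to the right. The shifting loop is why this is O(n).
  public static void insertAt(int[] A, int currentLength, int index, int value) {
    for (int i = currentLength; i > index; i--) {
      A[i] = A[i - 1]; // scoot elements over to make room
    }
    A[index] = value;
  }

  public static void main(String[] args) {
    int[] A = new int[5];
    A[0] = 10;
    A[1] = 20;
    A[2] = 30; // current length 3, capacity 5

    insertAt(A, 3, 0, 5); // worst case: prepend at index 0
    System.out.println(Arrays.toString(A)); // [5, 10, 20, 30, 0]
  }
}

The shifting loop runs once for every element to the right of the insertion point, which is where the O(n) cost comes from.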

5. Costly deletes

Deleting an element at the end of the array takes O(1) time, which is the best case. In computer science, we usually care about the worst-case scenarios when analyzing algorithms. When we remove an element from the middle or the start of the array, we have to fill the gap by scooting over all the elements after it. This is O(n) if we consider the case of deleting the element at the 0th index.

Array delete and shifting algorithms

The sketch shows an element being deleted at index 3, with the gap filled by shifting the rest of the elements one position to the left.
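
In the same spirit, here is a minimal sketch of such a delete; the removeAt helper is a hypothetical name used only for illustration.

import java.util.Arrays;

public class ArrayDelete {
  // Hypothetical helper: remove the element at index by shifting every
  // later element one position to the left. The shifting loop is why
  // deleting from the start or middle is O(n).
  public static int removeAt(int[] A, int currentLength, int index) {
    for (int i = index; i < currentLength - 1; i++) {
      A[i] = A[i + 1]; // fill the gap from the right
    }
    A[currentLength - 1] = 0; // clear the now-unused last slot
    return currentLength - 1; // new logical length
  }

  public static void main(String[] args) {
    int[] A = {10, 20, 30, 40, 50};

    int newLength = removeAt(A, 5, 0); // worst case: delete from index 0
    System.out.println(Arrays.toString(A)); // [20, 30, 40, 50, 0]
    System.out.println("New length " + newLength); // 4
  }
}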

Big-O Complexities

Lookup/Access a value at a given index: O(1). Accessing an element by its index is a constant-time operation.
Search for an element in an array: O(N). Searching for a specific element in an unsorted array requires iterating through each element in the worst case.
Update a value at a given index: O(1). Updating an element at any given index always takes constant time.
Insert at the beginning/middle: O(N). Inserting an element at the beginning or middle of the array requires shifting the existing elements, resulting in linear time complexity.
Append at the end: O(1). If the array has space available, inserting an element at the end takes constant time.
Delete at the beginning/middle: O(N). Deleting an element from the beginning or middle of the array requires shifting the remaining elements, resulting in linear time complexity.
Delete at the end: O(1). Deleting the last element of an array can be done in constant time.
Resize the array: O(N). Resizing an array requires creating a new array and copying the existing elements, which takes linear time.

The Big-O complexities mentioned above are for basic operations and assume an unsorted array. Some specialized data structures, such as heaps or hash tables, can provide more efficient alternatives for specific use cases.

In the next lessons, you will learn more about array capacity vs. length and about insertion and deletion algorithms. They are simple yet powerful and will help you when working on array problems.

The post Array Strengths, Weaknesses, and Big-O Complexity Analysis appeared first on ProdSens.live.

Disjoint Set Union heuristics https://prodsens.live/2023/04/02/disjoint-set-union-heuristics/ Sun, 02 Apr 2023 01:01:59 +0000

DSU is one of the most elegant data structures to implement, and I have used it many times in my competitive programming life. The internet is full of various implementations of it, but unfortunately there are almost no articles with a good proof of DSU’s efficiency. In this article I will do my best to uncover this secret.

A trivial Disjoint Set Union data structure can be implemented in the following way:

class DSU 
{
    // parent[v] stores the parent of v; a vertex whose parent is itself
    // is the representative of its set (the array is assumed to be
    // allocated with one slot per vertex)
    private int[] parent;

    void createSet(int vertex)
    {
        parent[vertex] = vertex;
    } 

    boolean isRepresentative(int vertex)
    {
        return parent[vertex] == vertex;
    }

    int findRepresentative(int vertex)
    {
        if (!isRepresentative(vertex))
        {
            return findRepresentative(parent[vertex]);
        }

        return vertex;
    } 

    void mergeSets(int lhs, int rhs)
    {
        int rhsRepresentative = findRepresentative(rhs);
        int lhsRepresentative = findRepresentative(lhs);

        if (lhsRepresentative != rhsRepresentative)
        {
            parent[lhsRepresentative] = rhsRepresentative;
        }
    }
}

Let’s start with two simple heuristics DSU uses.

Tree depth rank heuristic

The following heuristic suggests that we should attach a set-tree with smaller depth to a set-tree with larger depth.

void createSet(int vertex) 
{
    parent[vertex] = vertex;
    rank[vertex] = 1;
}

void mergeSets(int lhs, int rhs)
{
    int rhsRepresentative = findRepresentative(rhs);
    int lhsRepresentative = findRepresentative(lhs);

    if (rhsRepresentative != lhsRepresentative)
    {
        if (rank[lhsRepresentative] < rank[rhsRepresentative]) 
        {
            swap(lhsRepresentative, rhsRepresentative);
        }

        parent[rhsRepresentative] = lhsRepresentative;
        if (rank[lhsRepresentative] == rank[rhsRepresentative]) 
        {
            rank[lhsRepresentative] += 1;
        }
    }
}

Let's show that this heuristic reduces the complexity of findRepresentative to O(log(N)).

We can do this by proving that if a set-tree's rank equals K, then the tree contains at least 2^(K - 1) vertices and has depth at most K. We will use induction on K.

If K = 1, the tree is a single vertex: its size is 1 = 2^0 and its depth is 1.

Now let's understand how we get a set-tree with rank K. The rank increases only when we merge two set-trees that both have rank K - 1. By the induction hypothesis, each of them contains at least 2^(K - 2) vertices and has depth at most K - 1, so the new rank-K set-tree contains at least 2 * 2^(K - 2) = 2^(K - 1) vertices and has depth at most K. Attaching a lower-rank tree later only adds vertices and keeps the depth at most K, since that tree has depth at most K - 1 and hangs directly off the root. Because a tree of rank K holds at least 2^(K - 1) vertices, the rank (and hence the depth) of a tree with N vertices is at most log(N) + 1, so findRepresentative runs in O(log(N)).

Tree size rank heuristic

A similar heuristic, but this one suggests attaching the smaller set-tree to the larger one.

void createSet(int vertex) 
{
    parent[vertex] = vertex;
    size[vertex] = 1;
}

void mergeSets(int lhs, int rhs)
{
    int rhsRepresentative = findRepresentative(rhs);
    int lhsRepresentative = findRepresentative(lhs);

    if (rhsRepresentative != lhsRepresentative)
    {
        if (size[lhsRepresentative] < size[rhsRepresentative]) 
        {
            swap(lhsRepresentative, rhsRepresentative);
        }

        parent[rhsRepresentative] = lhsRepresentative;
        size[lhsRepresentative] += size[rhsRepresentative];
    }
}

In a similar way we can prove that if the size of a set-tree is K, then its height is at most log(K).

If K = 1, that's obviously true.


Let's look at two trees, Tree_k1 of size k1 and Tree_k2 of size k2. By the induction hypothesis, Tree_k1 has height at most log(k1) and Tree_k2 has height at most log(k2).


Let's say that k1 >= k2; then Tree_k2 will be attached to Tree_k1, and the new height of the tree is at most max(log(k1), log(k2) + 1).


It is now easy to show that the new height h is at most log(k1 + k2): since k1 >= k2, we have k1 + k2 >= 2 * k2, so log(k1 + k2) >= log(k2) + 1, and clearly log(k1 + k2) >= log(k1).


Now let's take a look at a slightly more interesting heuristic.

Tree path compression heuristic

Let's consider the following optimization of the findRepresentative function:

public int findRepresentative(int vertex)
{
    if (!isRepresentative(vertex)) 
    {
        parent[vertex] = findRepresentative(
            parent[vertex]
        );
    }

    return parent[vertex];
}

In this code we remove all edges on the path from the vertex to the root (the representative) of the set-tree and connect every vertex on that path directly to the root.

A first observation: path compression alone can still produce a tree of depth O(N). To construct such a case, we always attach the root of the current set-tree to a set-tree consisting of a single vertex.


In this construction, findRepresentative is always called on the set-tree root, so the complexity of that call is O(1) and compression never shortens anything.
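
If it helps to see the construction concretely, here is a small stand-alone sketch in Java with path compression only; the class and variable names are illustrative, and the merge direction matches the trivial mergeSets above (lhs's representative is attached under rhs's).

public class PathCompressionWorstCase {
    static int[] parent;

    static int findRepresentative(int v) {
        if (parent[v] != v) {
            parent[v] = findRepresentative(parent[v]); // path compression
        }
        return parent[v];
    }

    // Attach the representative of lhs under the representative of rhs.
    static void mergeSets(int lhs, int rhs) {
        int l = findRepresentative(lhs);
        int r = findRepresentative(rhs);
        if (l != r) {
            parent[l] = r;
        }
    }

    public static void main(String[] args) {
        int n = 8;
        parent = new int[n];
        for (int v = 0; v < n; v++) {
            parent[v] = v;
        }

        // Always merge the current root into a fresh single-vertex set:
        // findRepresentative is only ever called on roots, so compression
        // never triggers, and vertex 0 ends up at depth n - 1.
        for (int v = 1; v < n; v++) {
            mergeSets(v - 1, v);
        }

        // This single call now walks the whole chain: O(n).
        System.out.println(findRepresentative(0)); // 7
    }
}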

Let's consider the case where findRepresentative is called on a vertex that is not the set-tree root and understand why it has O(log N) complexity on average. To do so, let's start by introducing a category for the edges of the set-tree.

An edge (a, b), where a is the parent of b, has category 2^K if 2^K <= size(a) - size(b) < 2^(K + 1), where size(x) denotes the number of vertices in the sub-tree rooted at x.

Let's look at the path from a vertex x up to the representative that will be compressed when calling findRepresentative(x).


If this path has at most log(N) edges, we are happy, since we want to prove O(log N) complexity.

Let's say there are more than log(N) edges. In this case we will find at least two edges with the same category K (by the pigeonhole principle).

Let's define size(a / b) as the size of the sub-tree rooted at a without the sub-tree rooted at b.

We can easily see that size(u / v) >= 2^K + size(v) and size(a / b) >= 2^K + size(b). But we can say even more: v contains at least size(a / b) vertices in its sub-tree, which means that size(u / v) >= size(a / b) + 2^K >= size(b) + 2^K + 2^K.


Let's look more closely at the first edge of category 2^K on the path. With the observation above we can work out the category of the edge (R, v) that we add instead of (u, v) as part of path compression. The size of R's sub-tree is at least size(u / v) + size(v), which is more than 2^(K + 1) + size(v), while the size of v stays size(v), so size(R / v) >= 2^(K + 1). This means that if there are more than log(N) edges on the path from the representative R to the vertex x, then at least one edge's category will increase. The category is bounded by the size of the tree, so it cannot grow indefinitely.


Let's also show that when we replace an edge on the path with an edge to the root, the category of that edge can't decrease. Say we replace edge (u, v) with edge (R, v). Because R is an ancestor of u, its sub-tree contains at least as many vertices as u's, thus size(u / v) <= size(R / v).

All other edges keep their category, because the edge replacement either doesn't affect them or decreases the sizes of both endpoints of the edge by the same amount.

So, knowing that an edge's category only grows and is bounded, the total number of category increases cannot exceed (N - 1) * log(N) (every edge in a set-tree of size N can increase its category at most log(N) times). Combining this with the case where the path from a vertex to the representative has at most log(N) edges, we get a total complexity of O((M + N) log(N)) for M findRepresentative operations.

Combination of path compression with rank heuristic

The proof for this is rather complex, but I believe it is worth mentioning that combining path compression with the rank heuristic speeds up the algorithm to O(ackermann(n)) amortized per operation, where ackermann is the inverse Ackermann function, which grows extremely slowly (for example, for n <= 10^500, ackermann(n) < 4).
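
For completeness, here is a minimal sketch of the combined version, merging the rank heuristic from above with the path-compressing findRepresentative. The class name and the constructor-based initialization are my own additions for the sketch.

public class DSUCombined {
    private final int[] parent;
    private final int[] rank;

    public DSUCombined(int n) {
        parent = new int[n];
        rank = new int[n];
        for (int v = 0; v < n; v++) {
            parent[v] = v; // every vertex starts as its own representative
            rank[v] = 1;
        }
    }

    public int findRepresentative(int vertex) {
        if (parent[vertex] != vertex) {
            parent[vertex] = findRepresentative(parent[vertex]); // path compression
        }
        return parent[vertex];
    }

    public void mergeSets(int lhs, int rhs) {
        int l = findRepresentative(lhs);
        int r = findRepresentative(rhs);
        if (l == r) {
            return;
        }
        // Rank heuristic: attach the tree with smaller rank under the other.
        if (rank[l] < rank[r]) {
            int tmp = l;
            l = r;
            r = tmp;
        }
        parent[r] = l;
        if (rank[l] == rank[r]) {
            rank[l] += 1;
        }
    }
}

The two heuristics are independent, so combining them is just a matter of using the compressed findRepresentative inside the rank-based mergeSets.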

Summary

In this article we took a look at a fascinating data structure, the DSU. We proved the efficiency of its most common heuristics and mentioned one extremely efficient heuristic that combines two of them.

The post Disjoint Set Union heuristics appeared first on ProdSens.live.
