Session 1#
instructor -#
Data Structures#
Here’s a quick overview of the basic data structures in Python:
1. Lists#
- Ordered, mutable collections of elements.
- Defined with square brackets:
my_list = [1, 2, 3]
- Allows indexing, slicing, and modification of elements.
- Common methods: append(), remove(), pop(), sort().
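As a brief illustration of the list operations listed above, here is a minimal sketch. The list contents and variable names are made up for this example.

```python
# Hypothetical example data; the names are illustrative only.
numbers = [3, 1, 2]

numbers.append(4)      # add to the end: [3, 1, 2, 4]
numbers.remove(1)      # remove the first occurrence of the value 1: [3, 2, 4]
last = numbers.pop()   # remove and return the last element: [3, 2]
numbers.sort()         # sort in place: [2, 3]

first = numbers[0]     # indexing
tail = numbers[1:]     # slicing returns a new list
numbers[0] = 10        # modification by index

print(numbers, last, first, tail)
```

Note that sort() and the other methods shown here change the list in place rather than returning a new list.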
2. Tuples#
- Ordered, immutable collections of elements.
- Defined with parentheses:
my_tuple = (1, 2, 3)
- Cannot be modified after creation.
- Useful for fixed data and function returns.
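The two uses mentioned above (function returns and fixed data) can be sketched as follows. The function name min_max is hypothetical and chosen only for this example.

```python
def min_max(values):
    """Return the smallest and largest value as a tuple."""
    return min(values), max(values)

# Tuple unpacking assigns each element of the returned tuple to a name.
low, high = min_max([4, 9, 1])

point = (1, 2, 3)
error_message = ""
try:
    point[0] = 10          # tuples do not support item assignment
except TypeError as exc:
    error_message = str(exc)

print(low, high, error_message)
```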
3. Dictionaries#
- Collections of key-value pairs. Since Python 3.7 they preserve insertion order; in older versions they were unordered.
- Defined with curly braces:
my_dict = {"name": "Alice", "age": 25}
- Mutable, allowing in-place modification of key-value pairs.
- Common methods: get(), keys(), values(), update().
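A short sketch of the dictionary methods listed above, using made-up data:

```python
# Hypothetical example data.
person = {"name": "Alice", "age": 25}

age = person.get("age")                 # value for an existing key
city = person.get("city", "unknown")    # default returned for a missing key

person.update({"age": 26, "city": "NY"})  # change one value, add one pair

keys = list(person.keys())
values = list(person.values())
print(keys, values)
```

Using get() with a default avoids the KeyError that person["city"] would raise for a missing key.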
4. Sets#
- Unordered collections of unique elements.
- Defined with curly braces:
my_set = {1, 2, 3}
- Mutable; no duplicate elements allowed.
- Useful for membership testing and removing duplicates from lists.
- Common methods: add(), remove(), union(), intersection().
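The following sketch illustrates membership testing, duplicate removal, and the set operations listed above. All values are made up for the example.

```python
items = [1, 2, 2, 3, 3, 3]
unique = set(items)            # duplicates are dropped
has_two = 2 in unique          # membership test

left = {1, 2, 3}
right = {3, 4}
combined = left.union(right)          # elements in either set
shared = left.intersection(right)     # elements in both sets

# Note: {} creates an empty dictionary; an empty set is written set().
empty = set()
print(unique, has_two, combined, shared, type(empty).__name__)
```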
5. Strings#
- Ordered, immutable sequences of characters.
- Defined with quotes:
my_string = "hello"
- Cannot be modified, but can be sliced and concatenated.
- Common methods: upper(), lower(), replace(), split().
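A small sketch of the string methods listed above. Each call returns a new string (or list) and leaves the original untouched; the example text is arbitrary.

```python
greeting = "Hello World"

shouted = greeting.upper()
replaced = greeting.replace("World", "Python")
words = greeting.split()          # split on whitespace
first_word = greeting[:5]         # slicing
combined = first_word + "!"       # concatenation builds a new string

print(greeting, shouted, replaced, words, combined)
```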
Each of these structures offers distinct characteristics suitable for different types of data management and manipulation in Python.
Mutable and Immutable#
In Python, data types can be classified as mutable or immutable based on whether their values can be changed after creation.
Mutable Data#
Mutable data types are objects that can be modified after they are created. This means you can change their content without creating a new object in memory. Examples of mutable data types include:
- Lists: You can change elements, add or remove items.
my_list = [1, 2, 3]
my_list[0] = 10 # Modifies the first element
- Dictionaries: You can modify values associated with keys or add new key-value pairs.
my_dict = {"a": 1, "b": 2}
my_dict["a"] = 10 # Changes the value of "a"
- Sets: You can add or remove items.
my_set = {1, 2, 3}
my_set.add(4) # Adds an element to the set
Key Points:#
- Mutable objects allow modification without creating a new object.
- Modifying a mutable object can affect all references to that object.
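The second key point can be seen in a minimal sketch. The names original and alias are chosen only for this example.

```python
original = [1, 2, 3]
alias = original          # a second name for the same list object

alias.append(4)           # in-place change made through one name...

same_object = alias is original
print(original)           # ...is visible through the other name
```

To get an independent list instead, make a copy first, for example with list(original).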
Immutable Data#
Immutable data types, on the other hand, cannot be changed after they are created. Any modification results in a new object being created in memory. Examples of immutable data types include:
- Integers: Arithmetic does not change an integer in place; the variable is rebound to a different integer object.
x = 5
x = x + 1 # x now refers to a different integer object (6)
- Floats: Similar to integers; any modification results in a new float object.
- Strings: Strings cannot be altered; concatenation or modification results in a new string.
my_string = "hello"
my_string = my_string + " world" # Creates a new string object
- Tuples: Once created, elements in a tuple cannot be changed.
my_tuple = (1, 2, 3)
# my_tuple[0] = 10 # This would raise a TypeError
Key Points:#
- Immutable objects cannot be modified after creation.
- Each change creates a new object in memory, so references to previous values remain unaffected.
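A minimal sketch contrasting the two behaviours, assuming nothing beyond built-in types. The variable names are illustrative.

```python
# Immutable: "changing" a string rebinds the name to a new object.
text = "hello"
old_text = text                 # keep a reference to the original string
text = text + " world"
print(old_text)                 # the original string is unaffected

# Mutable: += on a list extends the existing object in place.
items = [1, 2]
old_items = items
items += [3]
print(old_items)                # the change is visible through both names
```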
Why Use Immutable vs Mutable Types?#
- Mutable types are useful when you need to change data frequently or add/remove items.
- Immutable types are safer for use in multi-threaded environments because they don’t change, making them more predictable and less error-prone. They also work well as dictionary keys or set elements because they cannot change after being created.
Immutable and Mutable for Dictionaries#
In Python, dictionaries are a versatile, mutable data type. They store data in key-value pairs and allow efficient data retrieval, insertion, and modification. Since dictionaries are mutable, they offer flexibility in managing dynamic data structures. However, the keys used within a dictionary must be immutable, which introduces interesting considerations about immutability in dictionary use.
Mutable Characteristics of Dictionaries#
- Modifiable Elements: You can add, update, or delete key-value pairs in a dictionary.
my_dict = {"name": "Alice", "age": 25}
my_dict["age"] = 26 # Modify value associated with "age"
my_dict["location"] = "NY" # Add new key-value pair
del my_dict["name"] # Delete a key-value pair
Each of these operations changes the dictionary in place without creating a new object in memory.
- In-Place Modification: Operations such as update(), pop(), and clear() modify the dictionary directly.
my_dict.update({"age": 27, "job": "Engineer"})
my_dict.pop("location")
- Performance Implications: Mutable dictionaries allow for efficient in-place updates, which are more memory-efficient than creating new copies. However, mutable types can introduce complexity when dictionaries are shared among functions or threads since changes to a dictionary will reflect across all references.
Immutable Keys in Dictionaries#
While dictionaries themselves are mutable, their keys must be hashable, which for built-in types means immutable. This is essential because a key's hash must stay constant for lookup and retrieval to work. If a key could change, it would make locating the value associated with that key unreliable.
- Valid Immutable Keys: Commonly used immutable types for dictionary keys include strings, numbers, and tuples (provided the tuple itself contains only hashable elements).
my_dict = {
    "name": "Alice", # String key
    42: "Answer to everything", # Integer key
    (1, 2): "Tuple Key" # Tuple key (contains immutable elements)
}
- Invalid Keys: Lists, dictionaries, or other mutable built-in types cannot serve as dictionary keys, as their contents can change, leading to inconsistent hashing.
my_dict = {[1, 2, 3]: "List Key"} # This will raise a TypeError
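The rule is really about hashability, which the following sketch tries to make visible. A tuple is accepted as a key only if everything inside it is hashable too, and frozenset is the hashable counterpart of set. The dictionary name lookup is made up for this example.

```python
lookup = {}
lookup[(1, 2)] = "tuple of integers"       # hashable: accepted
lookup[frozenset({1, 2})] = "frozenset"    # hashable: accepted

error_message = ""
try:
    lookup[(1, [2, 3])] = "tuple holding a list"   # the inner list is unhashable
except TypeError as exc:
    error_message = str(exc)

print(len(lookup), error_message)
```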
Mutable and Immutable Values in a Dictionary#
Dictionaries can store both mutable and immutable types as values:
- Immutable Values: If the value is immutable (e.g., a number or string), it cannot be changed in place. Reassigning the key simply makes it reference a different object; the old object itself is left unchanged.
my_dict = {"name": "Alice", "age": 25}
my_dict["age"] = 26 # Changes "age" to reference a different integer (26)
- Mutable Values: If the value associated with a key is mutable (e.g., a list or another dictionary), changes made to that value affect the dictionary directly since it points to the same object.
my_dict = {"grades": [85, 90, 95]}
my_dict["grades"].append(100) # Modifies the list in place
Here, adding an element to the list stored in my_dict["grades"] affects the dictionary, as it still references the same list object in memory.
Practical Considerations for Dictionaries and Mutability#
- Shared References: Be cautious when storing mutable objects as dictionary values, especially if these values will be referenced across multiple parts of your program. Modifying a mutable value will reflect across all references to it.
- Deep Copying: To avoid unintended modifications to mutable values within dictionaries, use copy.deepcopy() to make a full independent copy of nested dictionaries.
import copy
my_dict = {"grades": [85, 90, 95]}
my_dict_copy = copy.deepcopy(my_dict)
my_dict_copy["grades"].append(100) # This won't affect my_dict
- Use as Caching Mechanism: Dictionaries are commonly used as caches or lookups, where mutability enables quick in-place updates. However, if the dictionary's data should not be changed, consider using types.MappingProxyType to create a read-only view of the dictionary.
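A minimal sketch of the read-only view mentioned in the last point. The settings dictionary is hypothetical. Note that MappingProxyType wraps the original dictionary rather than copying it, so code that still holds the underlying dictionary can change what the view shows.

```python
from types import MappingProxyType

settings = {"debug": False}               # hypothetical example data
read_only = MappingProxyType(settings)    # read-only view, not a copy

write_blocked = False
try:
    read_only["debug"] = True             # item assignment is rejected
except TypeError:
    write_blocked = True

settings["debug"] = True                  # changes to the underlying dict...
seen_through_view = read_only["debug"]    # ...are visible through the view
print(write_blocked, seen_through_view)
```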
Example: Mutable and Immutable in a Nested Dictionary#
Consider a dictionary with nested structures that mix mutable and immutable types:
my_dict = {
"name": "Alice", # Immutable string as value
"grades": [85, 90], # Mutable list as value
"details": { # Nested dictionary (mutable)
"age": 25, # Immutable integer
"courses": ["Math", "Science"] # Mutable list
}
}
- Modifying Immutable Value: Changing "name" will simply update the reference.
- Modifying Mutable Value: Adding to the grades list or courses list will change the existing object, which can be useful but also introduces risks if not managed carefully.
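One way to see the risk is to compare a shallow copy with a deep copy of a nested dictionary like the one above. This is only a sketch; the variable names profile, shallow, and deep are chosen for illustration.

```python
import copy

profile = {
    "name": "Alice",
    "grades": [85, 90],
    "details": {"age": 25, "courses": ["Math", "Science"]},
}

shallow = profile.copy()         # new outer dict, inner objects are shared
deep = copy.deepcopy(profile)    # fully independent copy

profile["name"] = "Bob"                        # rebinding: only profile changes
profile["details"]["courses"].append("Art")    # in-place change to a shared list

print(shallow["name"], shallow["details"]["courses"])
print(deep["details"]["courses"])
```

The shallow copy keeps the old name but sees the new course, because it shares the nested objects with the original; the deep copy sees neither change.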
Summary#
- Dictionaries are mutable: You can add, remove, and modify key-value pairs without creating new dictionary objects.
- Keys must be immutable: Only immutable types can serve as dictionary keys, ensuring stability and consistency for efficient data retrieval.
- Values can be mutable or immutable: Mutable values, when altered, affect the dictionary directly, while immutable values are simply re-referenced when updated.
Understanding mutability in dictionaries helps prevent unintended side effects in data handling and makes it easier to write efficient and bug-free Python code.
Summary#
- Dictionaries are mutable: They allow in-place modification of key-value pairs (add, update, delete).
- Keys must be immutable: Only immutable types (like strings, numbers, and tuples) can be dictionary keys, ensuring reliable data retrieval.
- Values can be mutable or immutable:
- Mutable values (e.g., lists, other dictionaries) affect the dictionary directly when modified.
- Immutable values (e.g., integers, strings) cannot be changed in place; updating the key makes it reference a different object.
Memory Management in Python#
In Python, managing memory and references can lead to some interesting and complex scenarios, especially when dealing with mutable and immutable objects. Let's walk through some key complex scenarios and their explanations.
1. Reassigning References (Overwriting Variables)#
When you reassign a variable to a new object, the reference changes, and the original object may become unreachable, eventually leading to garbage collection.
a = [1, 2, 3]
b = a # b and a point to the same list
a = [4, 5, 6] # a now points to a new list, but b still points to the old list
print(b) # Output: [1, 2, 3]
- a originally pointed to [1, 2, 3], but after reassigning it to [4, 5, 6], b still points to the original list.
- The original list [1, 2, 3] will be collected by Python's garbage collector once there are no references to it. Here b still references it, so it stays alive.
2. Passing Mutable Objects to Functions#
When mutable objects are passed to functions, any changes made to the object inside the function will affect the original object outside the function, because both the caller and the function reference the same object in memory.
def modify_list(lst):
    lst.append(4)
a = [1, 2, 3]
modify_list(a)
print(a) # Output: [1, 2, 3, 4]
- a is passed by object reference: the parameter lst and the variable a refer to the same list. The list is modified inside the function, and the change is visible outside the function because the list is mutable.
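When this side effect is not wanted, one common approach is to copy the argument inside the function and return the new list. The function name with_extra_item is hypothetical.

```python
def with_extra_item(items):
    """Return a new list with 4 appended; the caller's list is untouched."""
    result = list(items)   # shallow copy of the argument
    result.append(4)
    return result

a = [1, 2, 3]
b = with_extra_item(a)
print(a, b)
```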
3. Reassigning Mutable Objects Inside Functions#
Even though mutable objects are passed by reference, reassigning the variable to a new object inside a function does not change the original object outside the function. It just creates a new reference inside the function.
def modify_list(lst):
    lst = [4, 5, 6] # This reassigns lst to a new object
a = [1, 2, 3]
modify_list(a)
print(a) # Output: [1, 2, 3]
- The lst = [4, 5, 6] inside the function reassigns lst to point to a new list, but this does not affect a outside the function, which still points to the original list [1, 2, 3].
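If the intent is to replace the caller's contents, slice assignment mutates the existing list instead of rebinding the local name. The sketch below contrasts the two; both function names are made up for this example.

```python
def rebind_only(lst):
    lst = [7, 8, 9]        # rebinds the local name; the caller is unaffected

def replace_contents(lst):
    lst[:] = [4, 5, 6]     # slice assignment mutates the existing list

a = [1, 2, 3]
a_id = id(a)

rebind_only(a)
after_rebind = list(a)     # snapshot of a after the first call

replace_contents(a)
print(after_rebind, a, id(a) == a_id)
```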
4. Multiple References to the Same Object#
In Python, multiple variables can reference the same object. Changes to that object will reflect across all variables pointing to it.
a = [1, 2, 3]
b = a # b references the same list as a
c = a # c also references the same list as a
a.append(4)
print(b) # Output: [1, 2, 3, 4]
print(c) # Output: [1, 2, 3, 4]
- a, b, and c all point to the same list. When a.append(4) is called, it affects all references because they all refer to the same list in memory.
5. Garbage Collection and Circular References#
Python uses garbage collection to automatically manage memory. In CPython, most objects are freed by reference counting as soon as their last reference disappears. Circular references (where two or more objects reference each other) defeat reference counting, because each object's count never drops to zero. A separate cyclic garbage collector finds and frees such groups, but it runs periodically, so objects with circular references may not be collected immediately.
import gc
class Node:
    def __init__(self, value):
        self.value = value
        self.next = None
node1 = Node(1)
node2 = Node(2)
node1.next = node2
node2.next = node1 # Circular reference
del node1
del node2 # Circular reference prevents immediate collection
gc.collect() # Explicitly call garbage collection to clean up circular reference
- The objects node1 and node2 reference each other, which creates a circular reference.
- Because of the circular reference, these objects are not freed as soon as the names are deleted. Python's cyclic garbage collector reclaims them on one of its periodic runs, or when gc.collect() is called explicitly.
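The effect can be observed with a weak reference, which does not keep its target alive. The sketch below assumes CPython and temporarily disables automatic collection so that the timing is predictable; the names first, second, and probe are illustrative.

```python
import gc
import weakref

class Node:
    def __init__(self, value):
        self.value = value
        self.next = None

was_enabled = gc.isenabled()
gc.disable()                       # pause automatic cycle collection for the demo
try:
    first = Node(1)
    second = Node(2)
    first.next = second
    second.next = first            # circular reference
    probe = weakref.ref(first)     # weak reference: does not keep `first` alive

    del first, second
    alive_before = probe() is not None   # the cycle keeps both objects alive

    gc.collect()                         # run the cyclic collector explicitly
    alive_after = probe() is not None    # the objects have now been freed
finally:
    if was_enabled:
        gc.enable()

print(alive_before, alive_after)
```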
6. Global and Local Variables (Reference Behavior)#
The scope of variables can affect how they interact with references, especially when you’re working with global and local variables.
a = [1, 2, 3] # Global variable
def modify_global():
    global a # Declare that we want to modify the global variable
    a = [4, 5, 6] # Reassign `a` to a new list
modify_global()
print(a) # Output: [4, 5, 6]
- The global keyword allows the function to modify the global variable a. Without it, Python would treat a as a local variable, which would not affect the global a.
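The global keyword is only needed for rebinding a name. Mutating a global object in place works without it, as this sketch (run as a module-level script, with hypothetical names) tries to show.

```python
counter_log = []              # hypothetical module-level list

def record(value):
    counter_log.append(value)     # in-place mutation: no `global` needed

def reset_wrong():
    counter_log = []              # creates a new local name; the global is unchanged

def reset_right():
    global counter_log
    counter_log = []              # rebinds the module-level name

record(1)
reset_wrong()
after_wrong = list(counter_log)   # snapshot after the ineffective reset
reset_right()
print(after_wrong, counter_log)
```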
7. Shallow vs Deep Copying#
When copying objects, Python provides two types of copying:
- Shallow Copy: Copies the outer object, but references to inner objects are shared between the original and the copy.
- Deep Copy: Creates a complete independent copy of both the outer object and all inner objects.
import copy
a = [[1, 2], [3, 4]]
# Shallow copy
b = copy.copy(a)
b[0].append(3)
print(a) # Output: [[1, 2, 3], [3, 4]]
print(b) # Output: [[1, 2, 3], [3, 4]]
# Deep copy
c = copy.deepcopy(a)
c[0].append(4)
print(a) # Output: [[1, 2, 3], [3, 4]]
print(c) # Output: [[1, 2, 3, 4], [3, 4]]
- Shallow Copy: Changes to the inner list affect both a and b, because the inner lists are shared.
- Deep Copy: c is entirely independent of a, including its inner lists, so changes to c do not affect a.
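Besides copy.copy(), lists offer several other ways to make a shallow copy. The sketch below checks with the is operator that all of them share their inner lists with the original, while a deep copy does not. Variable names are chosen for illustration.

```python
import copy

nested = [[1, 2], [3, 4]]

by_method = nested.copy()         # list.copy()
by_slice = nested[:]              # full slice
by_constructor = list(nested)     # list() constructor
deep = copy.deepcopy(nested)

outer_is_new = by_method is not nested
shares_inner = all(
    c[0] is nested[0] for c in (by_method, by_slice, by_constructor)
)
deep_is_independent = deep[0] is not nested[0]
print(outer_is_new, shares_inner, deep_is_independent)
```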
8. Assigning Immutable Objects#
For immutable objects like integers, strings, and tuples, reassigning a variable binds it to a different object rather than modifying the original one. Python handles memory efficiently by reusing existing objects for some immutable values when possible; CPython, for example, caches small integers.
a = 10
b = a # b points to the same object as a
a = 20 # a now points to a new object
print(b) # Output: 10
- When a = 20 runs, a is bound to the integer object 20, and b still points to 10, the original object.
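9. Memory Address and id()#
In Python, you can use the id() function to get an object's identity, an integer that is unique to the object for its lifetime. In CPython this value is the object's memory address. This can be useful for tracking whether two variables reference the same object.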
a = [1, 2, 3]
b = a # b and a reference the same object
print(id(a)) # Memory address of `a`
print(id(b)) # Memory address of `b` (same as `a`)
a.append(4)
print(a) # Output: [1, 2, 3, 4]
print(b) # Output: [1, 2, 3, 4] (same object as `a`)
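In practice, the is operator is the usual way to ask whether two names refer to the same object, while == compares values. A minimal sketch with illustrative names:

```python
a = [1, 2, 3]
b = a              # same object as a
c = [1, 2, 3]      # equal contents, but a separate object

same_identity = a is b
equal_value = a == c
different_identity = a is not c
ids_match = id(a) == id(b)
print(same_identity, equal_value, different_identity, ids_match)
```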
Summary of Complex Scenarios#
- Overwriting Variables: Reassigning a variable to a new object makes the old object eligible for garbage collection if nothing else references it.
- Mutable vs Immutable: Mutable objects can be modified in place, so changes are visible through every reference; immutable objects cannot be modified, so names are rebound to new objects instead.
- Circular References: Circular references can prevent immediate garbage collection, though Python's garbage collector can handle them in most cases.
- Global vs Local: Global variables can be rebound inside functions with the global keyword, while local variables are independent.
- Shallow vs Deep Copy: Shallow copies share references to inner objects, while deep copies create independent copies of all objects.
- Memory Management: The id() function helps check whether two names refer to the same object, and Python's garbage collector reclaims memory when objects become unreachable.
These scenarios highlight the complexities of memory management and references in Python, especially with mutable and immutable objects. Understanding these nuances is essential for writing efficient and bug-free Python code.
Deep Dive Question#
a=[20,30];a=[50,50]
Here, what happens to the older address that a pointed to?
In this code, the original list [20, 30] that a was pointing to will no longer be referenced by a once you reassign a to a new list [50, 50].
What Happens to the Original Address (Object [20, 30])#
Here’s a breakdown of what occurs:
- First Assignment: a = [20, 30]
  - A new list object [20, 30] is created in memory.
  - a is a reference to this list, meaning a now points to the memory address of [20, 30].
- Second Assignment: a = [50, 50]
  - A new list object [50, 50] is created in memory.
  - a is now updated to point to this new list, so it no longer points to the original list [20, 30].
- What Happens to [20, 30]:
  - Since a no longer references [20, 30], the original list now has no references to it.
  - In Python, when an object has no references, it becomes eligible for garbage collection.
  - Python's garbage collector will eventually reclaim the memory used by [20, 30], although the exact timing depends on Python's memory management system. In CPython, reference counting frees the list as soon as its last reference disappears; other implementations may wait until a later collection.
Summary#
- The original list [20, 30] becomes unreachable when a is reassigned.
- Python's garbage collector will eventually reclaim this memory because the list is no longer referenced by any variable.
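The timing can be observed with weakref.finalize, which runs a callback when its target is reclaimed. This is a sketch under two assumptions: it runs on CPython, where reference counting frees the object right away, and it uses a hypothetical list subclass named Payload because plain lists cannot be weakly referenced.

```python
import weakref

class Payload(list):
    """A list subclass; unlike plain lists, it supports weak references."""

events = []

a = Payload([20, 30])
# Register a callback that fires when the first object is reclaimed.
weakref.finalize(a, events.append, "old list reclaimed")

a = Payload([50, 50])   # rebinding drops the last reference to the first object

print(events, a)
```

On CPython the callback has already run by the time print is reached; on implementations without reference counting it may run later.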