12)Python Misc.:

Python Misc.:

  • s = "".join(c for c in s if c not in ('a,'c','e'))
  • Easy to convert a string to a list of characters: list(thestr)
    • The converse is just ''.join(lst_str)
  • Conditionals in list comprehensions:
    • alst = ['%20' if c == ' ' else c for c in alst]
      • This returns a (possibly modified) element for each element in the list conditional on what the element is.
    • alst = ['%20' for c in alst if c == ' ']
      • This filters, i.e., picks what elements of the list the comprehension will operate on.
  • Any/all can be helpful:
    • Check if all elements of a list are x:
      • if all(c == ' ' for c in alst): return -1
    • Check if any tuples contain a negative value:
      • if any(x < 0 or y < 0 for (x, y) in list_ranges):
        • raise Exception("Negative start/end times don't make sense.")
  • list[::-1] reverses the list
  • list.reverse() is in-place; reversed(list) returns the reversed list (technically an iterator).
    • Similarly, list.sort() is in-place, sorted(list) returns the sorted list.
    • Can supply arguments: sorted(list, key=lambda x: x[1], reverse=True) returns a reversed sorted list of tuples sorted by the second element in each tuple.
    • For a secondary sort, use a tuple like: sorted(list, key=lambda x: (x[1], x[0]))
  • If want to reverse but keep the indices the same, use xrange's arguments:
    • for i in xrange(len(int_list) - 1, -1, -1):
  • Time complexity of Python data structures:
    • https://wiki.python.org/moin/TimeComplexity
    • list membership is O(n) so if need to do membership tests, use a (frozen)set.
      if elem not in arr
      • Use a frozenset if need immutable / hashable set (e.g., if using as a dict key or a nested set).
    • set membership checking is O(1) since it's implemented as a dummy dict
      • dict membership is O(1) for keys
  • string constants can be useful:
    • if char in string.letters
    • if char in string.punctuation
    • space = set(string.whitespace)
    • if str_digit in string.digits
  • Can easily swap variables without using a temporary variable:
    • a, b = b, a
    • E.g.: this works:
      • a = 1; b = 2
      • a, b = b + 1, a + 1
      • # a = 3, b = 2; instead of e.g. a = 3, b = 4
    • More practically, to reverse a linked list, just do:
      • def reverse_ll(head):
        • curr, prev = head, None
        • while curr:
          • curr.next, prev, curr = prev, curr,  curr.next
        • return prev
  • Object oriented / classes:
    • class Person:
      • def __init__(self, name, age):
        • self.name = name
        • self.age = age
      • def another_method(self, arg):
        • pass
  • To add to a linked list in a loop, need to set a head node at the start to be able to reference the entire thing at the end (because cur_node will be constantly updated while iterating through). Sometimes might need a dummy head node in which case will need to return dummy.next at the end.
    • E.g.:
      • def reverseList(self, head):
        • """
        • :type head: ListNode
        • :rtype: ListNode
        • """
        • if not head: return None
        • ll_arr = []
        • node = head
        • # This excludes last node, need to add back.
        • while node.next:
          • ll_arr.append(node.val)
          • node = node.next
        • ll_arr.append(node.val)
        • r_cur = r_head = ListNode(ll_arr[-1])
        • for ele in list(reversed(ll_arr[:-1])):
          • r_cur.next = ListNode(ele)
          • r_cur = r_cur.next
        • return r_head
  • When iterating through a linked list, use:
    • while node:
      Not while node.next since that will exclude the last node.
    • def reverseList(self, head):
      • """
      • :type head: ListNode
      • :rtype: ListNode
      • """
      • if not head: return None
      • ll_arr = []
      • node = head
      • while node:
        • ll_arr.append(node.val)
        • node = node.next
      • r_cur = r_head = ListNode(ll_arr[-1])
      • for ele in list(reversed(ll_arr[:-1])):
      • r_cur.next = ListNode(ele)
      • r_cur = r_cur.next
      • return r_head
    • If deleting node in singly linked, consider using runner.next.
  • If need to reverse or parse or do tree/graph traversal in a problem, consider using a stack:
    • "Two common uses for stacks are:
      • parsing (like in this problem)
      • tree or graph traversal (like depth-first traversal)
    • So remember, if you're doing either of those things, try using a stack!"
  • c.isalnum() tests whether alphanumeric character
  • s.strip() removes leading and trailing whitespace
  • To initialize a list of (empty) lists, do [[] for i in range(3)]
    • result: [[], [], []]
    • Also, if I need a visited grid or some kind of matrix, do:
      • visited = [[False for x in xrange(len(grid[0]))] for y in xrange(len(grid))]
      • or: res = [[1 for i in xrange(len(A[0]))] for i in xrange(len(A))]
    • [0 for i in xrange(5)] —> [0, 0, 0, 0, 0]
  • Possible to do an if elif else ternary in one line:
    • return num % 9 if num % 9 else 9 if num else 0
      Digital root.
  • Stacks are easily implemented with lists: just use append() for push and pop() for pop.
    Append adds to end of list, pop (without explicit index) removes and returns from end of list.
  • Queues, however, should be implemented with deques, with append() for queue and popleft() for dequeue.
    • Re: popleft(), Imagine the list going from right-to-left, where the right is the end. (I.e., add elements at the right).
    • To initialize a deque with a tuple, need to do deque([(x, y)]):
      • but can append with just q.append((x, y))
    • If don't want to use a deque, can just do pop(0).
      equivalent to popleft()
  • When updating / incrementing a dictionary, use get to simplify the null check (whether the key already exists):
    • for c in s:
      • freqs[c] = freqs.get(c, 0) + 1
    • or: my_dict[some_key] = my_dict.get(some_key, 0) + 1
    • the set equivalent is my_dict.setdefault(some_key, default), but most of the time, can either use (preferably) defaultdict or get
  • Dictionary comprehension and sorting:
    • iterating over a dict only iterates over its keys
      • i.e., for i in dict: i will be each key, not each key-value pair
      • similarly, to test for key membership, just do: if key in dict
    • to get the actual key-value items:
      • for i, j in dict.iteritems()
      • applies similarly for comprehensions: new_d = {k: v+1 for k, v in d.iteritems()}
    • sorted list of items (k-v tuples) by key:
      • sorted(d.iteritems())
    • sorted list of keys:
      • sorted(d)
    • sorted list of items (tuples) by value:
      • sorted(d.iteritems(), key=lambda x: x[1])
    • sorted list of keys by value:
      • sorted(d, key=lambda x: d[x])
  • Sometimes might need positive or negative infinity values:
    • -float('Inf')
  • Logical operator evaluation:
    http://thomas-cokelaer.info/tutorials/python/boolean.html
    • The evaluation using the and and or operators follow these rules:
      • and and or evaluates expression from left to right.
      • with and, if all values are True, returns the last evaluated value. If any value is false, returns the first one.
      • or returns the first True value. If all are False, returns the last value
    • E.g.:
    • not x: Returns True if x is True, False otherwise
    • x and y: Returns x if x is False, y otherwise
    • x or y: Returns y if x is False, x otherwise
  • To remove duplicates from a list but maintain order, can just do list(OrderedDict.fromkeys(arr)):
    • more hacky way would be using a set http://www.martinbroadhurst.com/removing-duplicates-from-a-list-while-preserving-order-in-python.html
    • seen = set()
    • return [x for x in arr if not (x in seen or seen.add(x)]
  • To prepend, use list.insert(0, x), but if doing this, consider just using a deque which has appendleft and popleft.
    • To extend but at the start, can do a[:0] = b.
    • If using a deque and extendleft, remember that it reverses the order of the iterable passed in.
  • Heaps can be very useful if I need to maintain some kind of min/max value, but it needs to be dynamically updated over time among other possible previous values:
    https://docs.python.org/2/library/heapq.html
    • Basically just initialize heap as a normal list and then use heaq.heappush(arr, x), heappop, and heapreplace to push, pop, and replace an item while maintaining min-heap invariant.
    • https://leetcode.com/problems/meeting-rooms-ii/description/
    • intervals.sort(key=lambda x: x.start)
    • # keep track of min end time
    • heap = []
    • for n in intervals:
      • # heap[0] is always the min end time
      • # this means we can reuse this room but just need to update end
      • if heap and n.start >= heap[0]:
        • heapq.heapreplace(heap, n.end)
      • # otherwise we need to allocate a new room
      • else:
        • heapq.heappush(heap, n.end)
    • return len(heap)
  • When using reduce, might need to have an initializer / default value (e.g., when it's an empty list):
    • prod = prev_prod * reduce(lambda x, y: x*y, list_ints[i+1:], 1)
  • Deep vs. shallow copying: https://www.wikiwand.com/en/Object_copying
    • https://www.cs.utexas.edu/~scottm/cs307/handouts/deepCopying.htm
    • Shallow copies duplicate as little as possible. A shallow copy of a collection is a copy of the collection structure, not the elements. With a shallow copy, two collections now share the individual elements. A chance in the underlying elements will change the copy as well.
    • Deep copies duplicate everything. A deep copy of a collection is two collections with all of the elements in the original collection duplicated. This means that the elements are unique to each copy.
  • Can't use a list as a dict key (can use tuple, but can't have a mutable type inside like a list):
    Keys need to be immutable to be hashable.
    • But can use the string representation.
    • while amount_left >= 0:
      • if (amount_left, repr(denominations[current_index+1:])) in memoize:
        • rest_poss = memoize[(amount_left, repr(denominations[current_index+1:]))]
      • else:
        • rest_poss = change_possibilities_top_down(amount_left, denominations, current_index + 1)
        • memoize[(amount_left, repr(denominations[current_index+1:]))] = rest_poss
      • num_possibilities += rest_poss
      • amount_left -= current_coin
    • Can also convert the list to a tuple or frozenset.
  • Dynamic programming & memoization:
    • When we memoize (cache a subproblem's answer / recursive function's output so we don't need to recompute it again), the memoization dictionary probably needs to be outside the function to save/share state. I.e., make a class. Interview Cake #5: https://www.interviewcake.com/question/python/coin
    • Top-down recursion can be memory-intensive because of building up the call stack. Can avoid by going bottom-up.
    • To use, always think about starting from the end, i.e., if I had an answer for everything but the last something (element, cell, subproblem, etc.), what more do I need? Basically the n+1 step of an induction proof.
    • Interview Cake:
      • "Dynamic programming is kind of like the next step up from greedy . You're taking that idea of "keeping track of what we need in order to update the best answer so far," and applying it to situations where the new best answer so far might not just have to do with the previous answer, but some other earlier answer as well.
      • So as you can see in this problem, we kept track of all of our previous answers to smaller versions of the problem (called "subproblems") in a big list called ways_of_doing_n_cents.
      • Again, same idea of keeping track of what we need in order to update the answer as we go, like we did when storing the max product of 2, min product of 2, etc in the highest product of 3 question. Except now the thing we need to keep track of is all our previous answers, which we're keeping in a list.
      • We built that list bottom-up, but we also talked about how we could do it top-down and memoize. Going bottom-up is cleaner and usually more efficient, but often it's easier to think of the top-down version first and try to adapt from there."
    • Memoization is top-down depth-first and DP is bottom-up breadth-first:
      • http://stackoverflow.com/questions/27609246/if-memoization-is-top-down-depth-first-and-dp-is-bottom-up-breadth-first-what?rq=1
      • http://blog.racket-lang.org/2012/08/dynamic-programming-versus-memoization.html
  • Don't be scared of adding null elements/children/etc. to a container, can always check for it when popping from the container and just ignore or continue if null.
  • To convert a letter to a number/index, can use ord. ord('a') = 97, so just subtract that.
  • Note that tuple comprehensions don't exist—using parentheses creates a generator. Instead, just do a list comprehension and convert it to a tuple afterward.
  • In a for-else, the else clause executes when the loop terminates naturally, i.e., without a break.
    • http://python-notes.curiousefficiency.org/en/latest/python_concepts/break_else.html
    • http://book.pythontips.com/en/latest/for_-_else.html
    • while-else can be useful too and is equivalent to
      • while condition:
        • handle_true()
      • else:
        • # condition is false now, handle and go on with the rest of the program
        • # didn't break
        • handle_false()
    • so basically the else always happens unless I exit the loop through a break
    • useful here https://leetcode.com/problems/asteroid-collision/submissions/
      • def asteroidCollision(self, asteroids):
        • """
        • :type asteroids: List[int]
        • :rtype: List[int]
        • """
        • stk = []
        • for ast in asteroids:
          • if not stk or ast > 0:
            • stk.append(ast)
            • continue
          • while stk and stk[-1] >= 0:
            • prev = -1 * stk[-1]
            • if prev < ast:
              • break
            • elif prev == ast:
              • stk.pop()
              • break
            • else:
              • stk.pop()
          • else:
            • stk.append(ast)
        • return stk
  • List appends and extends modify the list (and don't return anything). If I want to return something, just use + (i.e. concatenation):
    • This doesn't work:
      • curr = [1, 1]
      • for i in xrange(2, k + 1):
        • c1 = [0].extend(curr)
        • c2 = curr.append(0)
        • curr = [c1 + c2 for c1, c2 in zip(c1, c2)]
    • This does work:
      • for i in xrange(2, k + 1):
        • c1 = [0] + curr
        • c2 = curr + [0]
  • Can just convert a string into a set of the chars directly: str_set = set(the_str)
  • In any case where I need to do 2 for loops going forward then backward (2 pointers), remember that I can always construct the reverse pointer index from the "forward" one, i.e.:
    • for i in xrange(len(nums)):
      • j = -1 - i
    • will give me the opposite end pointer each time
    • can also do: new_j = -j - 1
  • Be careful when initializing a matrix:
    • do: self.board = [[None] * n for _ in xrange(n)]
    • not: self.board = [[None] * n] * n
    • The second way makes copies of / references to the inner list such that any modification to one row changes the other rows
  • To modify a list in place, have to use nums[:], if use nums =, then that generates new list.
  • One common Python gotcha involves default mutable arguments:
    • If we try to do something like: def func(x, arr=[]):
    • "A new list is created once when the function is defined, and the same list is used in each successive call.
    • Python’s default arguments are evaluated once when the function is defined, not each time the function is called (like it is in say, Ruby). This means that if you use a mutable default argument and mutate it, you will and have mutated that object for all future calls to the function as well."
      http://docs.python-guide.org/en/latest/writing/gotchas/
    • Instead, do:
      • def func(x, arr=None):
        • if not arr: arr = []
  • To compare dictionaries (i.e., whether the keys and values are the same), just use ==. Dictionaries are not ordered so it works.
  • or shortcircuits, so can do things like: cur.next = l1 or l2
    • This will add the rest of linked list 1 if it's not null, otherwise add l2.
  • Stack of iterators is a nice design pattern for iterative DFS:
    • http://garethrees.org/2016/09/28/pattern/
    • def search(root):
      • stack = [iter([root])]
      • while stack:
        • for node in stack[-1]:
          • stack.append(iter(children(node)))
          • break
        • else:
          • stack.pop()
    • This avoids the three problems above:
      • Pushing and popping a list is faster than calling a function in Python.
      • Lists can grow without limit, unlike the function call stack.
      • Since there’s no recursion, you can just return when you have a result:
        • def search(root):
          • stack = [iter([root])]
          • while stack:
            • for node in stack[-1]:
              • if target(node):
                • return node
              • stack.append(iter(children(node)))
              • break
            • else:
              • stack.pop()
    • "The control flow here might seem a bit tricky if you’re not used to the way that Python’s for ... else construct interacts with break. The pattern works by maintaining a stack of iterators that remember the position reached in the iteration over the children of the node at each level of the search. After pushing a new iterator on the stack, the break exits the for loop, bypasses the else: clause, and goes round the while loop again, so that it picks up the new iterator from stack[-1]. When there are no more children in the current iteration, the for loop exits via the else: clause and pops the stack. Then the next iteration of the while loop picks up the iteration at the previous level from where it left off."
  • count function is useful:
    • if s.count(c) == 1:
    • also works for arrays of nums!
      • freqs1 = {i: nums1.count(i) for i in nums1}
  • repr() is intended to be unambiguous; str() intended to be readable.
  • is / is not compares object references (tests for object identity); == / != compares values (tests for object value equality).
    is will return True if two variables point to the same object; == if the objects referred to by the variables are equal.
  • Beware of using IndexError to bounds check, since negative indices are valid and will just wrap around.
  • Note that abstract data types (e.g., stack, queue) are distinct from data structures (e.g., array, heap). The former is a mathematical model, whereas the latter is a concrete representation of data from the perspective of the implementer, not the user.
    From Wiki.
    • For example, a list is not the same sort of thing as a linked list -- one is an ADT, the other a data structure.
    • Further, abstract data types specify a sort of contract: what operations can be done (e.g., pop, push), what constraints exist on those operations (e.g., dequeue returns the first element), and what sort of data on which the operations act (e.g., nodes). Data structures, then, are just the implementations of ADTs.
  • Remember that algorithm efficiency analysis (i.e., Big O) represents the upper bound of an algorithm's running time. Only the highest order term matters, disregarding constants. For example, natural log running time has the same complexity as logarithmic (base 10) running time: O(log n).
    • I.e., the runtime is expressed in terms of how quickly it grows relative to the input, as the input gets arbitrarily large.
    • http://stackoverflow.com/questions/2307283/what-does-olog-n-mean-exactly
    • Logarithms of all bases grow at the same asymptotic rate.
    • http://stackoverflow.com/questions/17121006/confusion-around-big-o-with-natural-logs
  • Remember that parameters are passed by assignment (by reference but references passed as values), so if pass a mutable argument (like list), can mutate it in a function and have the changes reflected elsewhere / outside the scope.
    http://stackoverflow.com/questions/986006/how-do-i-pass-a-variable-by-reference
  • In a nested list comprehension, the order of for- and if-expressions is the same as it would be in a loop format.
    • I.e., if trying to duplicate:
      • for row in grid:
        • for x in row
    • would just be:
      • [x for row in grid for x in row]
  • Strings are immutable, so to do in-place need to convert to a list and do something like s[:] = list(s).
    getting a copy of s, basically.
  • Shallow vs deep copy:
    • http://stackoverflow.com/questions/17246693/what-exactly-is-the-difference-between-shallow-copy-deepcopy-and-normal-assignm
  • Instead of i = -i + 1, can use ~:
    • def is_palin(self, s):
      • for i in xrange(len(s) / 2):
        • if s[i] != s[~i]: return False
      • return True
  • Generators are nice if need to know state while iterating through a list or if the list is potentially very large / infinite:
    • https://jeffknupp.com/blog/2013/04/07/improve-your-python-yield-and-generators-explained/
    • binary tree DFS generator iterator:
      https://leetcode.com/problems/maximum-width-of-binary-tree/discuss/106707/Python-Straightforward-BFS-and-DFS-solutions
      • def widthOfBinaryTree(self, root):
        • def dfs(node, depth = 0, pos = 0):
          • if node:
            • yield depth, pos
            • yield from dfs(node.left, depth + 1, pos * 2)
            • yield from dfs(node.right, depth + 1, pos * 2 + 1)
        • left = {}
        • right = {}
        • ans = 0
        • for depth, pos in dfs(root):
          • left[depth] = min(left.get(depth, pos), pos)
          • right[depth] = max(right.get(depth, pos), pos)
          • ans = max(ans, right[depth] - left[depth] + 1)
        • return ans
  • Use list(itertools.chain(*listoflists)) to flatten a nested list.
  • In a dict comprehension, when using a ternary operator, the 'else' is just for the value:
    • d = {k: str(v) if k in str_keys else v for k, v in d.iteritems()}
    • not: d = {k: str(v) if k in str_keys else k: v for k, v in d.iteritems()}
  • If implementing a boolean method (has_next) for a list:
    • use: return bool(list)
    • or, more hacky, return not not list
  • To time a function:
    • def gen_permutes(s):
      • stuff
    • if __name__ == '__main__':
      • from timeit import timeit
      • def wrapper(func, *args, **kwargs):
        • def wrapped():
          • return func(*args, **kwargs)
        • return wrapped
      • s = 'aabfcd'
      • wrapped = wrapper(gen_permutes, s)
      • wrapped2 = wrapper(gen_permutes2, s, 0, 5)
      • g1 = round(timeit(wrapped, number=10000), 3)
      • print 'gen_permutes of {} took {} seconds to get:'.format(s, str(g1))
  • Default arguments can be useful (see, e.g., https://docs.python.org/2/tutorial/controlflow.html), but be careful when the default is a mutable object like a list, since default values are only evaluated once.
  • Can get ceiling with integer division by using negative sign, since Python [2] always rounds down / does floor. I.e.: 8 / 3 = 2 but -8 / 3 = -3. So to get positive, just add another negative: times = -(-len(B) / len(A))
  • Portable shebang line: #!/usr/bin/env python

No comments

darkmode