
How to Dedupe Items for a Unique List

A common problem when dealing with lists is deduping, or removing duplicate items. There are a few ways to get a completely unique list in Python.

The Fastest Way to Dedupe

>>> yourList = list(set(yourList))

All we're doing is leveraging two of Python's built-ins: set() and list(). Calling set() converts your list into a set; by definition, a set holds only unique entries, so this automatically removes any duplicate items from your original list. The items must be hashable, which rules out things like nested lists and dicts.

Since you probably still want a list, we convert the set back with list(), which builds a new list from the set's items.
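As a small illustration of the set round trip, here is a sketch with a made-up sample list. Because the resulting order is not guaranteed, the example sorts before printing, and it also shows what happens when the items are not hashable.

```python
your_list = [3, 1, 3, 2, 1]  # sample data, invented for this example

# Round trip through a set to drop the duplicates.
unique = list(set(your_list))

# The order of `unique` is not guaranteed, so sort it for display.
print(sorted(unique))

# Unhashable items (such as nested lists) cannot go into a set.
try:
    set([[1, 2], [1, 2]])
except TypeError as exc:
    print("cannot dedupe with set():", exc)
```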

Preserving Order

Sometimes you'll want to preserve the order of a list. Sets are unordered collections, so converting to a set and back may or may not preserve the original order of your list. To solve this, we write a slightly longer script:

>>> yourList = [0,1,2,2,3,5,5,5,7,9,9]
>>> uniqueList = []
>>> for value in yourList:
...     if value not in uniqueList:
...         uniqueList.append(value)
...
>>> uniqueList
[0, 1, 2, 3, 5, 7, 9]

Here we create a new container (uniqueList) before we process the original list (yourList). Then we use a for loop to go through every value in the original list. If we haven't seen the value before, we add it to the new list. If we have seen it before, we disregard it and move on to the next value in the original list.

In the end, you're left with a completely unique deduped list in the same order the original list was in.
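The loop above checks membership against a list, which gets slow for long inputs because every check scans the whole list. A possible variation, sketched here under the assumption that the items are hashable, tracks seen values in a set instead. The function name dedupe_keep_order is made up for this example. On Python 3.7 and later, dict.fromkeys() gives the same result in one line, since dicts keep insertion order.

```python
def dedupe_keep_order(items):
    """Return a new list with duplicates removed, keeping first-seen order."""
    seen = set()      # values encountered so far (fast membership checks)
    result = []       # unique values, in original order
    for item in items:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result


your_list = [0, 1, 2, 2, 3, 5, 5, 5, 7, 9, 9]

unique_list = dedupe_keep_order(your_list)

# One-line alternative: dict keys are unique and (3.7+) keep insertion order.
unique_via_dict = list(dict.fromkeys(your_list))

print(unique_list)
print(unique_via_dict)
```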

What do *args, **kw, **kwargs Mean?

Have you ever looked at someone else's function and wondered what the arguments *args, **kw, or **kwargs mean or do?
Example:

def my_function(proto, *args, **kw):
    # rest of the function code here
    pass

Those parameters collect a variable number of arguments: *args gathers any extra positional arguments, and **kw (or **kwargs) gathers any extra keyword arguments. Essentially, they are placeholders for multiple arguments, and they are extremely useful when you need to pass a different number of arguments each time you call the function.
Example:

>>> my_function('filter', 'python', 'html', start='now')
>>> my_function('proto', 'filter')
>>> my_function('ask', pg='2')

The names args, kw, and kwargs aren't special in themselves; it's the asterisks that make them special. A single * collects all the extra positional arguments (e.g. 'python' and 'html' in the first call) into a tuple, whereas a double ** collects all the keyword arguments, the ones written with an equals sign (e.g. start='now'), into a dictionary.
Note: If you use both the single (*) and double (**) asterisk in a function definition, the single (*) parameter MUST come before the double (**) one.
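To see the tuple and the dictionary for yourself, here is a minimal sketch. The function body is a stand-in written for this example: it simply returns what each parameter received.

```python
def my_function(proto, *args, **kw):
    # Stand-in body: hand back each piece so we can inspect it.
    return proto, args, kw


# 'filter' fills proto; the other positionals land in args; start= lands in kw.
first = my_function('filter', 'python', 'html', start='now')
print(first)   # ('filter', ('python', 'html'), {'start': 'now'})

# With no keyword arguments, kw is an empty dictionary.
second = my_function('proto', 'filter')
print(second)  # ('proto', ('filter',), {})

# With no extra positional arguments, args is an empty tuple.
third = my_function('ask', pg='2')
print(third)   # ('ask', (), {'pg': '2'})
```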

Changing or Spoofing Your User Agent in Python

Probably the most sought-after use of manipulating HTTP headers is changing, or "spoofing", your user agent, for legitimate or nefarious purposes. By default, Python's urllib2 identifies itself with a user agent such as Python-urllib/2.6 (under Python 2.6). To change the user agent and the other request headers, we'll have to make a few changes to the basic Python URL request. This time I wrote a function so we can reuse it later.
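The function itself is not reproduced here, so what follows is only a minimal sketch of the idea, not the original code. It assumes Python 3, where urllib2's Request and urlopen live in urllib.request. The name build_request, the example URL, and the user-agent string are all hypothetical.

```python
import urllib.request

# Hypothetical user-agent string, used purely for illustration.
CUSTOM_USER_AGENT = "Mozilla/5.0 (compatible; ExampleBot/1.0)"


def build_request(url, user_agent=CUSTOM_USER_AGENT, extra_headers=None):
    """Return a Request whose User-Agent (and any other headers) we control."""
    headers = {"User-Agent": user_agent}
    if extra_headers:
        headers.update(extra_headers)
    return urllib.request.Request(url, headers=headers)


# Build the request only; nothing is sent over the network here.
request = build_request(
    "http://example.com/",
    extra_headers={"Accept-Language": "en"},
)

# Request stores header names capitalized, hence "User-agent" below.
print(request.get_header("User-agent"))
```

Passing the request to urllib.request.urlopen(request) would then send it with those headers in place of the default Python-urllib one.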

Request, Fetch, and Crawl URLs

Do you need Python to fetch a URL for you? Depending on your needs, there are various ways Python can request a URL.

For this article, we'll be taking a look at urllib2 for its ease of use and flexibility.
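As a rough sketch of the basic request, here is what a fetch might look like. It assumes Python 3, where urllib2's urlopen is available as urllib.request.urlopen. The helper name fetch, its timeout default, and the example URL are assumptions made for this example.

```python
import urllib.request


def fetch(url, timeout=10):
    """Request a URL and return the raw response body as bytes."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


# Usage (needs network access, so it is left commented out):
# html = fetch("http://example.com/")
# print(html[:60])
```

The body comes back as bytes; call .decode() on it with the page's encoding if you need text.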
