Python: sorting a dependency list -
i'm trying work out if problem solvable using builtin sorted() function or if need myself - old school using cmp have been relatively easy.
my data-set looks like:
x = [ ('business', set('fleet','address')) ('device', set('business','model','status','pack')) ('txn', set('device','business','operator')) ....the sort rule should value of n & y y > n, x[n][0] not in x[y][1]
although i'm using python 2.6 cmp argument still available i'm trying create python 3 safe.
so, can done using lambda magic , key argument?
-== update ==-
thanks eli & winston! didn't think using key work, or if suspected shoe horn solution isn't ideal.
because problem database table dependencies had create minor add-on eli's code remove item list of dependencies (in designed database wouldn't happen, lives in magical perfect world?)
my solution:
def topological_sort(source): """perform topo sort on elements. :arg source: list of ``(name, set(names of dependancies))`` pairs :returns: list of names, dependancies listed first """ pending = [(name, set(deps)) name, deps in source] emitted = [] while pending: next_pending = [] next_emitted = [] entry in pending: name, deps = entry deps.difference_update(set((name,)), emitted) # <-- pop self dep, req py2.6 if deps: next_pending.append(entry) else: yield name emitted.append(name) # <-- not required, preserves original order next_emitted.append(name) if not next_emitted: raise valueerror("cyclic dependancy detected: %s %r" % (name, (next_pending,))) pending = next_pending emitted = next_emitted
what want called topological sort. while it's possible implement using builtin sort()
, it's rather awkward, , it's improve implement topological sort straight in python.
why going awkward? if study 2 algorithms on wiki page, both rely on running set of "marked nodes", concept that's hard contort form sort()
can use, since key=xxx
(or cmp=xxx
) works best stateless comparing functions, particularly because timsort doesn't guarantee order elements examined in. i'm (pretty) sure solution does utilize sort()
going end redundantly calculating info each phone call key/cmp function, in order around statelessness issue.
the next alg i've been using (to sort javascript library dependancies):
edit: reworked greatly, based on winston ewert's solution
def topological_sort(source): """perform topo sort on elements. :arg source: list of ``(name, [list of dependancies])`` pairs :returns: list of names, dependancies listed first """ pending = [(name, set(deps)) name, deps in source] # re-create deps can modify set in-place emitted = [] while pending: next_pending = [] next_emitted = [] entry in pending: name, deps = entry deps.difference_update(emitted) # remove deps emitted lastly pass if deps: # still has deps? recheck during next pass next_pending.append(entry) else: # no more deps? time emit yield name emitted.append(name) # <-- not required, helps preserve original ordering next_emitted.append(name) # remember emitted difference_update() in next pass if not next_emitted: # entries have unmet deps, 1 of 2 things wrong... raise valueerror("cyclic or missing dependancy detected: %r" % (next_pending,)) pending = next_pending emitted = next_emitted
sidenote: is possible shoe-horn cmp()
function key=xxx
, outlined in python bug tracker message.
python sorting topological-sort
No comments:
Post a Comment