« go back

Hardware-constrained matching algorithms

These are two independent problems in the same problem domain.

1) Given a weighted bipartite graph, the optimal maximum weighted matching can be found using Edmonds Algorithm. In order to apply this algorithm to the problem domain of interest, an efficient hardware implementation is required in order to achieve the required throughput, since processing in excess of 100 graphs of 1000 nodes per second is not achievable on current conventional processors. A Boltzmann machine provides a suitable approximation for solving the problem using a parallel processing array, but it too is not suitable for an FPGA implementation due to resource requirements. Therefore, an alternative parallel implementation for solving the maximum weighted matching problem (or providing a good approximation) is sought that can be realistically implemented on an FPGA or microprocessor. A suitable algorithm needs to have the following qualities:

a. Must be able to be split into blocks of multiple parallel processes.

b. Must be deterministic.

c. Must run in a fixed time possibly dependent on the number of nodes but independent of the weights.

2) The input graph in the problem domain is transformed into a bipartite graph suitable for solving using Edmonds algorithm using the following transform:

Split each vertex into two new vertices (v, 0) and (v, 1) and replace each directed edge (u, v) with undirected edges ((u, 0), (v, 1)).


When using Edmonds algorithm to solve this maximum weighted matching problem, the weights between nodes are assumed to be independent. However, in the problem domain of interest, this is not always the case and up to at least the next 32 weights per node may be present, dependent on the previous matching:


This problem is NP hard. A greedy algorithm can be used to extract the largest pairs of weights and then the resultant bipartite graph can be solved with Edmonds algorithm. However, a more accurate approximation is sought.


Problem presented by:

Study Group contributors:
David Barton (University of Bristol)
Kamal Bentahar (University of Oxford)
Malgorzata Bladoszewska (University of Warsaw)
Dafydd Evans (Cardiff University)
Benjamin Franz (University of Oxford)
Ingrid von Glehn (University of Oxford)
John Lees-Miller (University of Bristol)
Robert Leese (Industrial Mathematics KTN)
Eddie Wilson (University of Bristol)

Related resources:

Problem brief

Initial presentation

Final presentation.

Other ICT projects

Other Study Group projects

Archived discussion


Replies to This Discussion

Reply by Industrial Mathematics KTN on April 4, 2011 at 4:23pm

Here is the initial presentation from Thales


Thales Matching Problem.pdf, 119 KB

Reply by R. Eddie Wilson on April 5, 2011 at 9:14am

Hello everybody - there are a few of us working on this problem remotely today - me (Eddie Wilson) from home; Robert Leese (somewhere on the road); and John Lees-Miller (in his office in Bristol, I think). I am going to try and drop comments and code in to the NING as I go along. Could the problem room reply back loud "yes we hear you Eddie".

Reply by John Lees-Miller on April 5, 2011 at 10:27am

Hi all,

Auction algorithms seem like a plausible starting point. Here is a recent paper on a parallel auction implementation that might be helpful (paper + poster + CUDA source):

I've also uploaded some Octave (a MATLAB clone) code that I wrote a while back for the simplest version of this auction algorithm (sequential) and an example run (pdf). It doesn't use any of the structure in the Thales problem, but maybe it will generate ideas.


 naive_auction_ap.m, 1 KB

 naive_auction_ap.pdf, 18 KB

Reply by John Lees-Miller on April 5, 2011 at 10:13pm

Update: The "sync_auction_ap" method in the attached zip implements the "synchronous" algorithm from the attached paper (Bertsekas 1988), which I think is more representative of how one would code it for an FPGA than the naive algorithm. It includes the \epsilon allowance needed to avoid cycling. It has also been extended to allow for some people (rows) to have no object (column) assigned.

When I run it on Jim's example, I get 1,111,127 as the objective value:

>> A_ex = loadJimEx;>> [m_ex, p_ex] = sync_auction_ap(A_ex);>> ap_value(A_ex, m_ex)ans =     1111127>> max(abs(ap_complementary_slackness_error(A_ex, m_ex, p_ex)))ans =   8.3963e-04The epsilon complementary slackness condition in the paper is satisfied, so I believe that this gives the optimal solution (famous last words; would be good to confirm with a Hungarian solver / LP solver).
It does take a few iterations to get there, but most of those are spent on very small refinements (just chasing the few final entries around); in practice, one would terminate it after some fixed number of iterations. Run time is about 1.5s on my desktop with MATLAB R2010b.
One issue is that the auction method doesn't get a feasible solution until it finishes, so if it is truncated early, it will be necessary to patch up the remaining unassigned elements in some suboptimal way.



Bertsekas (1988). The Auction Algorithm - A Distributed Relaxation Method for the Assignment Problem.pdf, 1.3 MB, 45 KB

Reply by R. Eddie Wilson on April 5, 2011 at 11:01pm

Hi JD - it's ball park correct I think.

I've just tested my wrapper to fix the Inf issue in munkres.m.

NB I used enm-rew-srv1 for this and it was still a couple of minutes.

However I am getting 1,083,768 as the optimal value.

Very weird that you are getting higher - but reassuringly close!

Pesky little bug(s) either in your code, my code, or in the published munkres.m!

Reply by R. Eddie Wilson on April 5, 2011 at 11:16pm

A couple of those pesky bugs fixed...

However, I now get the optimum as 1,100,040

Closer, but unfortunately still less than you!

Reply by R. Eddie Wilson on April 5, 2011 at 11:30pm

Ok, I've attached...

eddiemunkres.m --- which is a wrapper I've written for the central file exchange code munkres.m to fix its problem with "Inf". So you do something like



Because this takes a while (!) even on a big machine, I've attached a  .mat containing the answer.

JD - can you cross check this with your assignment?

The objective fn is still 1,100,040, but it could be my mistake somewhere.


 eddiemunkres.m, 1 KB

 assignmentJimEx.mat, 1 KB

 Reply by John Lees-Miller on April 6, 2011 at 8:13am

Hmm. A few quick checks before I have to run off...

Running ap_value(A_ex, ass) gives 1,100,040, as expected, so at least we're agreeing on how to compute the objective function.

I've attached the auction method solution, m_ex, for easy reference.

The two solutions differ in 61 assignments.

A curious fact is that m_ex makes fewer assignments (572 vs 587).

Both appear to be feasible (no repeated nonzeros).

I will try feeding the problem into an LP solver, if I have time today.

Gotta dash...


 m_ex.mat, 1 KB

Reply by R. Eddie Wilson on April 6, 2011 at 8:29am

See my new munkres.m and comment, which for some reason is not appearing here but on the next page!

Reply by R. Eddie Wilson on April 5, 2011 at 11:05am

Hello - basic building blocks here. I'm attaching some simple matlab code which generates appropriate example weights matrices. I think it is self-explanatory. Regs. - Eddie


buildThalesWeightsMatrix.m, 1 KB

Reply by R. Eddie Wilson on April 5, 2011 at 11:12am

Hello. Building on the Matlab code I just uploaded. Of course, for each example we want the definitive MWM (max weight matching) irrespective of speed, to test the faster algorithms against. The simplest thing to do is to use the Hungarian (Munkres) algorithm, even though this is not computationally bleeding edge. There's lots of implementations out there on the web. The best I found so far was at

In Matlab, do

A=buildThalesWeightsMatrix;   % to generate a Weights matrix, using my earlier code

A=-A;   % because the provided code is set up to min, rather than to max, then


Then go and have a cup of tea. It takes a while because of course it isnt efficient. In a moment I will build an example and upload its optimum assignment, for testing purposes.

NB this "munkres" code seems to deal gracefully with "disallowed" assignments using the Inf technique I have used here. It seems also to be happy to generate assignments in which some nodes are not used.

Reply by Dafydd Evans on April 5, 2011 at 11:49am

Hello from the problem room at Cardiff. Thanks for the code. Watch this space.

Reply by R. Eddie Wilson on April 5, 2011 at 2:13pm

Hi Dafydd - Can you give us an update on what is happening in the problem room? Do you have any students to start throwing at Matlab tasks?

I need to switch tracks to other things shortly, but I will be back later this afternoon... Eddie

Reply by R. Eddie Wilson on April 5, 2011 at 11:45am

Hello - following my thread - see attached example1.mat.

A is the weights matrix (suitably negated for minimisation).

"ass" is the optimal assignment vector.

"cost" is the corresponding optimal value of the objective function.

Interestingly only "source" node 1024 is unassigned (as seen by find(ass==0)). I have no idea whether this is typical or not.

Could we agree to make this the definitive example to test other algorithms on?


example1.mat, 333 KB

Reply by R. Eddie Wilson on April 5, 2011 at 11:54am

The interesting (?) thing you will note about the optimal assigment I just posted is that it is entirely trivial - it is [[2:1024] 0].

I think perhaps I need to find ways of generating more "screwy" weights matrices to make the example a bit more interesting!

Reply by jim on April 5, 2011 at 1:19pm

Weights example file

a set of 'simple' data for a two-source problem from email that Glen sent to David Allwright 31/3/11

Hope this helps



weights.txt, 36 KB

Reply by R. Eddie Wilson on April 5, 2011 at 1:30pm

Thanks for this Jim - it's a useful example and I'll script  a Matlab import function for it in a moment.

Actually - I think it is easy enough to generalize my earlier script to simulate more structured data. And I think I will do this - since we might want to consider data with more than 2 sources....

Reply by R. Eddie Wilson on April 5, 2011 at 2:12pm


See attached code.


produces a Matlab weights matrix from Jim's data. If you then do

A=-A;   % to turn it in to a minimisation problem, then

[ass,costs]=munkres(A);   % you get the assignment

This seems to make sense in terms of the original "source" numbers - but I'd be much obliged if someone (in the room?) would check this!


loadJimEx.m, 1 KB

Reply by Dafydd Evans on April 5, 2011 at 3:25pm

The script loadJimEx.m seems to work OK, although it misses a few on the way through e.g. 2 and 7 and 11 are not assigned to the same chain as 0, 5, 9, 13, 14, 16 ... but appear in another chain 2,7,11,17 ... (all of these belong to the same group).

At the end of the assignment, node 594 is assigned to node 3, which seems a little odd and probably inflates the cost. In fact, the "costs" returned by the script is simply Inf. We think this might be because of an infinite cost on the 594-3 link. We repeat the experiment on the first 200 points and see whether the problem persists.

Reply by Dafydd Evans on April 5, 2011 at 3:49pm

Node 594 (the penultimate node)  can only connect to the last node, 594 (the edge is of weight is 575, the weight of all other edges being implicitly infinite). However, the last node (594) node has already been assigned, so node 594 goes back to the start of the list and attaches itself to the first unassigned node, which is node 3. The weight of the 594-3 link is -Inf, so the total weight becomes -Inf. Presumably, the algorithm can't cope with this and exits.

Reply by R. Eddie Wilson on April 5, 2011 at 4:44pm

Bummer. This doesn't sound too good - is this using the munkres.m that I pointed you to? (It isn't supposed to assign edges where there are Inf costs.)

Reply by R. Eddie Wilson on April 5, 2011 at 4:48pm

You're right - I've just checked the output of my munkres.m and it does exactly what you say. So basically it is not implementing the "disallowed connection" feature which it says it does! So for normative testing, we need a better piece of code. I'll have a bit of a dig around on the net.

Reply by R. Eddie Wilson on April 5, 2011 at 11:31pm

Dafydd - see the thread started by John Lees-Miller on p1.

This problem is now fixed (I think) by my wrapper eddiemunkres.m

Reply by Shengxin Zhu on April 5, 2011 at 3:49pm

I am trying the following Graph Toolbox.

where there is  program  'grMaxMatch' - solve the maximal matching problem for the graph, using the integer programing to solve the problem.

just several lines to call the matlab function 'bintprog'

and for 'bintprog' there is some  'BranchStrategy' which might be useful ?! I am not sure

Reply by R. Eddie Wilson on April 6, 2011 at 8:27am

Hi - well the way in which I set the dummy weights wasnt quit right - although its possible that it didnt affect the solution - new version attached - but Im on a train! Cant connect to the server to try it out!

BTW Robert are you running with this today?


eddiemunkres.m, 2 KB

Reply by Dafydd Evans on April 6, 2011 at 8:43am

Here is a simple script for extracting chains from an assignment. The assignment in "assignmentJimEx.mat"  contains 8 chains, while the assignment in "m_ex.mat" contains 23 chains.

eddiemunkres.m is taking forever to run on my machine.


chains.m, 1 KB

Reply by Dafydd Evans on April 6, 2011 at 8:54am

Small typo in the new version of eddiemunkres.m - line 39 "finute_minus_inf" should be "finite_minus_inf". New version took 266 seconds on the JimEx data.

Reply by R. Eddie Wilson on April 6, 2011 at 9:36am

Dafydd - does eddiemunkres validate john Lees Millers auction algorithm code - ie do we finally agree on the optimum assignment and its cost?

Reply by David Barton on April 6, 2011 at 9:56am

Yes, the two codes now give the same results: cost function is 1111127 and the assignments match.

Reply by John Lees-Miller on April 6, 2011 at 12:26pm

OK; good to hear that we're converging.

For completeness: this morning I did my own test using an LP solver and got the same answer. This was more than I could manage in MATLAB without an internet connection, so the script is written in Ruby, instead. The script is attached. It takes a file like Jim's weights.txt as input and spits out the assignment as a chunk of MATLAB code. To run it, you need lp_solve 5.5, which you can get from I tested it on Linux, but it might run on Windows, too.


ap_lp_weights.rb, 4 KB

Reply by Robert Leese on April 6, 2011 at 11:50am

Yes, I'm in the problem room now and here until later afternoon.  Impressed with progress so far!  We are hoping to get more background on the second problem and turn some attention to that later today.

Reply by Benjamin Franz on April 6, 2011 at 10:13am

Thanks Eddie and John for your codes. We manipulated John's auction code slightly so we can see the development of the objective value per iterations in order to give us some idea of what happens when the algorithm is just stopped in the middle. From this example it seems like the number of iterations it takes to get to within 5% of the optimum value is very small (10-20 iterations) so it seem like even with the time strict time constraints this should do the trick for us.

We also did a quick back of the envelope calculation of how many iterations we would be allowed for one block of data. Assuming about 10^8 operations per core (which in practice might be much higher) and the algorithm parallelised on 32 cores we would have 100 iterations, which brings us very close to the optimum for the given data set.

Reply by Robert Leese on April 7, 2011 at 8:19am

Notes on Problem 2: to summarise the conclusions of the helpful discussions with Doug yesterday afternoon ... Problem 2 should be solved in a block, not sequential, and in this respect is similar to Problem 1.  The data set that Doug provided (most in the room have it if anyone else wishes to see it) attaches weights to triples of nodes and the matching of Problem 1 becomes a collection of triples (or paths of length 2 if you prefer thinking that way).  Again the objective is to maximise the total weight, subject to the constraint that the selected triples cannot overlap.  There was some discussion about what 'overlap' means here, which probably needs a further chat with Doug, but it seemed to mean that we could not select both (i,j,k) and (j,k,l) but we could select (i,j,k) and (k,l,m) (i<j<k<l<m).

Reply by Robert Leese on April 7, 2011 at 8:25am

It will be interesting to know whether JLM's auction algortihm can generalise to Problem 2, and if so how its run-time is affected.  I think it might be useful to think of a weight w_{ijk} on triple (i,j,k) meaning that bidder i places a bid of w_{ijk} for the pair of objects j&k.  There are dynamic programming approaches to winner determination in auctions that would solve these problems in a time linear in the number of nodes, but with an impractically large data structure.  Does the iterative auction algorithm have a number of iterations that is (roughly) independent of n?  If so then we are in very good shape, I think, since the total run time would remain roughly linear in n.  Eddie - you said that there was empirical evidence along these lines, but not a formal proof?

Reply by Dafydd Evans on April 7, 2011 at 9:12am

Here is the data provided by Doug for Problem 2.

Attachments:, 757 KB

Reply by David Barton on April 7, 2011 at 9:16am

Format of the data is

<source ID of node n>, <node n>, <node n+m>, <weight for node n+m+1>, .., <weight for node n+m+32>

Where m = 1,..32

Attached is a Matlab file to parse the file and put it in a cell array.


loadproblem2data.m, 1 KB

Reply by R. Eddie Wilson on April 7, 2011 at 9:20am



does similar, pads with NaNs for the rows with less than 35 elts

Reply by R. Eddie Wilson on April 7, 2011 at 9:38am

A suggested import code is as follows:


sources = x(:,1);

l_node = x(:,2) + 1;  % NB must index from 1 rather than 0

c_node = x(:,3) + 1; 

x = x(:,4:end);

num_rows = size(x,1);

num_nodes = 1024; 

weights=NaN(32,32,1024);   % NB major index at end

for i=1:num_rows

weights(c_node(i)-l_node(i),:,l_node(i)) = x(i,:);


Reply by Shengxin Zhu on April 7, 2011 at 9:45am

there is a bug here ?

just note the x(end,:) just have 3 coloumns


Reply by R. Eddie Wilson on April 7, 2011 at 11:33am

No - I think its fine.

Reply by John Lees-Miller on April 7, 2011 at 10:16am

We're still thinking about the 'pair bidding.' On the complexity question:

From the 1988 paper, worst case running time when using epsilon scaling (starting with a large epsilon and then reducing it in subsequent iterations) is O(NAlog(NC)), where N is the number of persons, A is the number of pairs of persons and objects that can be assigned to each other, and C is the maximum absolute object value.

It looks to me like work per iteration is O(A) (which for us is O(N), because the degree is bounded by a constant). This gives O(Nlog(NC)) iterations with epsilon scaling.

However, empirically it does seem that most people are assigned reasonable objects in the first few iterations.

Reply by R. Eddie Wilson on April 7, 2011 at 10:29am

This still needs debugging.

But I am adding some draft code to generate artificial (partially structured) sequences and their weights. For discussion / experimentation.

myfactor is a parameter essentially controlling the "noise".


gensample.m, 1 KB

Reply by R. Eddie Wilson on April 7, 2011 at 11:36am

Some extra samples from Doug for problem 1.


Problem1Weights_eg1.csv, 321 KB

Problem1Weights_eg2.csv, 326 KB

Problem1Weights_eg3.csv, 327 KB

Reply by R. Eddie Wilson on April 7, 2011 at 11:36am

And another


Problem1Weights_eg4.csv, 263 KB

Reply by R. Eddie Wilson on April 7, 2011 at 12:38pm

New version of gensample.m


gensample.m, 1 KB

Reply by Dafydd Evans on April 7, 2011 at 1:52pm

Latex source and figures from day 3 presentation.

Attachments:, 272 KB

Reply by Robert Leese on April 8, 2011 at 10:44am

From what Dave and John have just presented so clearly, it seems that the challenge in problem 2 is whether the Bertsekas-type auction algorithm can be extended to a combinatorial context.   The attached paper by Parkes and Ungar may well hold a few clues.


Iterative Combinatorial Auctions.pdf, 153 KB

Reply by John Lees-Miller on April 8, 2011 at 5:50pm

Here are the final presentation slides, tex and figures.

We missed at least one name (sorry Shengxin!) -- if there are any others, feel free to add them on.

Thanks, everyone, for a fun week!


thales.pdf, 399 KB, 291 KB

Reply by Shengxin Zhu on April 11, 2011 at 9:34am

Thank you very much, but feel free to keep me out of the list if you publish anything.

Frankly speaking I contribute little thing to the problem up to now except listening and learning!

No comments yet. Be the first.

Most read articles

Maintenance inspection design software for full-scale industrial systems

Asset inspection and maintenance is an important activity. The ability to use data from...

Special Interest Group in Environmental Risk Management

The Industrial Mathematics Special Interest Group (SIG) in Environmental Risk Management was...

Green buildings-thermal comfort and energy consumption

The objective of this Industrial CASE PhD project is to develop novel mathematical models for...