How the Daniel 2 Ratings are Calculated
To show how these ratings are calculated, consider the following example. The following game results occurred in the 2000 football season:
Southern Mississippi defeated Alabama.
Alabama defeated South Carolina.
South Carolina defeated Georgia.
Tennessee defeated Southern Mississippi.

Let's say we want to calculate Southern Miss' total rating. To do that, we want to compare Southern Miss to every other team in Division I-A. To show how this comparison works, consider some examples.

Southern Miss defeated Alabama. Obviously, this is good for Southern Miss' rating. Let's stipulate that, by virtue of its victory over Alabama, Southern Miss gains one rating point.

Southern Miss did not play South Carolina. However, Southern Miss did defeat Alabama, who in turn defeated South Carolina. Thus, Southern Miss does have an indirect "victory" over South Carolina. This, too, is good for Southern Miss' rating, but not as good as if Southern Miss had defeated South Carolina directly. Therefore, we only give Southern Miss half a rating point by virtue of its indirect "victory" over South Carolina.

Southern Miss did not play Georgia, either. However, it did beat Alabama, who beat South Carolina, who in turn beat Georgia. So, we can say that Southern Miss has an indirect "victory" over Georgia, although this "victory" is even more indirect than the victory over South Carolina. Thus, it gets only half the rating points it got for its "victory" over South Carolina, or a fourth of a rating point.

As you can see, for each extra game needed to get an indirect "victory," the rating points a team gets from that victory decrease by half.

A team not only gains rating points for defeating other teams directly or indirectly, but it also loses rating points for losing to other teams directly and indirectly. For example, Southern Miss was defeated by Tennessee. By virtue of this loss, Southern Miss loses a rating point.

To calculate a team's total rating, simply sum all the rating points it gets for its "victories" over all the other teams, and subtract the points it loses for its "losses" to all the other teams. Note that the teams being rated must all be connected by games before these ratings have any meaning.
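The arithmetic in the example above can be checked with a few lines of Python. This is only a sketch of the halving rule applied to the games mentioned in the text, not Southern Miss' actual season rating; the names points, wins and losses are made up for illustration.

```python
def points(games):
    """Rating points for a "victory" that takes `games` games to establish:
    1 for a direct win, halved for each extra game in the chain."""
    return 0.5 ** (games - 1)

# Number of games in each chain, taken from the example in the text.
wins = {"Alabama": 1, "South Carolina": 2, "Georgia": 3}
losses = {"Tennessee": 1}

rating = (sum(points(n) for n in wins.values())
          - sum(points(n) for n in losses.values()))
print(rating)  # 0.75  (1 + 1/2 + 1/4 - 1)
```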
A More Mathematical Explanation
The following is a mathematical explanation. Please excuse any mathematical incorrectness in the following description; my purpose here is to explain things in a more concise and detailed way, not to be mathematically rigorous.

Let A, B, and C be any three teams. If A has defeated B at any point in the season, then we say that A is better than B, and denote this A>B. (Note that when we say A is better than B, we mean that A performed better than B in that game. It does not mean that A is a better team overall.) Note that it is possible that both A>B and B>A if A and B play each other twice, something that is not possible for real numbers.

Now, we describe a transitive property for teams: if A>B and B>C, then A>C. Thus, even if A and C do not play each other, we can say that A is transitively better than C. A sequence of teams, arranged in descending order, for which repeated application of the transitive property yields the result A>B is called a transitive path from A to B. For example, A>D>E>F>B is a transitive path from A to B.

This transitive property of teams is similar to the transitive property of, for example, real numbers. However, there is a crucial difference: the property of being better than another team is something that must be quantified. To establish ratings, we cannot simply say that A is transitively better than B; we have to say how much better A is than B. How much better A is than B is determined by the number of applications of the transitive property it takes to prove that A>B.

If A>B, then there must be a transitive path from A to B. We define a function, denoted L(A,B), to be the length of the shortest transitive path from A to B. (The length of a transitive path is the number of teams in the path minus one. The reason for this is that we like to think of teams as evenly-spaced "points" on the path, with A and B the "endpoints" of the path. Thus, the length of the path is the number of teams on the path minus one, or, equivalently, the number of games on the path. For example, the path A>D>E>F>B has length 4.) If A is not transitively better than B, then L(A,B) is undefined. Now, define a Transitive Power Function:
          / 2^(1-L(A,B)),  if A>B
T(A,B) = |
          \ 0,             if not A>B
This tells us how much better A is than B. Note that T(A,B) decreases
exponentially with L(A,B). Given these definitions, to calculate a team's rating (denoted R(A)), use the following formula:
R(A) = Sum [ T(A,B) - T(B,A) ]
        B
where the summation is taken over the set of teams we are ranking. (This
is not necessarily the set of all teams. In the Daniel 2 Ratings,
only Division I-A teams are ranked, but paths can go through
non-Division-I-A teams.)
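To make the definitions of L, T and R concrete, here is a minimal Python sketch that evaluates them for a small list of games. It is an illustration under stated assumptions, not the script used for the real ratings: it finds shortest paths with a plain breadth-first search, and the function names (shortest_lengths, ratings) and the (winner, loser) input format are invented for this example.

```python
from collections import deque

def shortest_lengths(start, wins):
    """Return {X: L(start, X)} for every team X with a transitive path from start.
    Teams with no such path are simply absent (L is undefined for them)."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        team = queue.popleft()
        for loser in wins.get(team, ()):
            if loser not in dist:
                dist[loser] = dist[team] + 1
                queue.append(loser)
    del dist[start]
    return dist

def ratings(games, ranked):
    """R(A) for each team in `ranked`; paths may pass through unranked teams."""
    wins = {}
    for winner, loser in games:
        wins.setdefault(winner, []).append(loser)
    teams = {t for game in games for t in game} | set(ranked)
    L = {t: shortest_lengths(t, wins) for t in teams}

    def T(a, b):
        # Transitive power: 2^(1-L(a,b)) if a>b, otherwise 0.
        return 2.0 ** (1 - L[a][b]) if b in L[a] else 0.0

    return {a: sum(T(a, b) - T(b, a) for b in ranked if b != a) for a in ranked}

games = [("Southern Miss", "Alabama"), ("Alabama", "South Carolina"),
         ("South Carolina", "Georgia"), ("Tennessee", "Southern Miss")]
ranked = ["Tennessee", "Southern Miss", "Alabama", "South Carolina", "Georgia"]
r = ratings(games, ranked)
print(r["Southern Miss"])  # 0.75
```

The ranked argument plays the role of the set being summed over: a team left out of it contributes nothing to the sums, but its games can still form part of a path between ranked teams.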
Discussion of the Python Script that Calculates the Ratings
Currently, I calculate the ratings using a Python script. Reading in the input and tabulating the final scores is rather trivial, so I don't describe those steps in detail here. See the script if you're curious.
The most difficult part of the calculation is determining the shortest transitive path between teams. The script does that by using a recursive search. Basically, what it does is create a 2-D dictionary that holds, for each pair of teams, the length of the shortest transitive path found so far. Each entry in this 2-D array is initialized with a large positive number, 9999, indicating that the shortest transitive path has not yet been found. The script then chooses a school, say, for example, Penn State. For each team that Penn State defeated, it stores a value of 1 in the 2-D array. It then recursively checks each of those teams' victories, with the path length increasing by one at each step.

Now, if, somewhere in the recursive search, a team is encountered to which a transitive path has already been found, then the script checks whether the current path length to that team is shorter than the one found previously. If it is, then the script overwrites the path length stored in the table with the current one, and recursively checks the team's victories. If the current path is not shorter, then it is a stopping condition; there is no need to further consider that team or any of the teams it has defeated, since an equal or shorter path to it has already been found.

Once the 2-D array is finished for each team, the transitive power function is applied and total ratings are calculated in a straightforward manner. The script to do this is surprisingly small, only about 300 lines, and 115 of those are just to list the teams in Division I-A.
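The search described above might look roughly like the following sketch. This is a reconstruction from the description, not the actual script: the names UNREACHED, fill_row and path_table and the (winner, loser) input format are assumptions made for illustration.

```python
UNREACHED = 9999  # sentinel from the text: no transitive path found yet

def fill_row(row, team, wins, length):
    """Record paths of `length` games to each team that `team` defeated,
    then recurse into those teams' victories."""
    for loser in wins.get(team, ()):
        if length < row[loser]:
            row[loser] = length  # shorter path found: overwrite and keep going
            fill_row(row, loser, wins, length + 1)
        # Otherwise an equal or shorter path is already known: stop here.

def path_table(games):
    """Build the 2-D dictionary table[A][B] = shortest transitive path length."""
    wins = {}
    teams = set()
    for winner, loser in games:
        wins.setdefault(winner, []).append(loser)
        teams.update((winner, loser))
    table = {a: {b: UNREACHED for b in teams} for a in teams}
    for a in teams:
        fill_row(table[a], a, wins, 1)
    return table

games = [("Southern Miss", "Alabama"), ("Alabama", "South Carolina"),
         ("South Carolina", "Georgia"), ("Tennessee", "Southern Miss")]
table = path_table(games)
print(table["Southern Miss"]["Georgia"])  # 3
```

Each recursive call strictly lowers an entry in the table, so the search terminates even when teams have beaten each other in a cycle. Entries still equal to 9999 at the end mark pairs with no transitive path.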