Flaws in the Daniel 2 Rating System

The Daniel 2 rating system has a number of flaws that make its results dubious. Most of them stem from differences in schedule structure that give some teams an advantage.

An obvious flaw is that teams play different numbers of games. Suppose two teams finish the season undefeated, but one played only 11 games, whereas the other played 13. The second team had the opportunity to gain two rating points that the first team didn't have. The usual fix in cases like this is simply to divide by the number of games played. Here, that won't work. The Daniel 2 ratings are a point-to-point comparison between teams; every team is compared to every other team, and these comparisons are summed to give the total rating. Thus, there is no "per game" rating, and dividing by the number of games is not appropriate.

A less obvious flaw can be seen by considering Florida State and Notre Dame. Suppose they both finish the season undefeated, each playing 11 games. Even though they play the same number of games, their schedules look quite different to the computer. The reason is that Florida State plays in a conference, the ACC, while Notre Dame is independent. In the ACC, every team plays every other team. So if, say, Georgia Tech beats everyone else in the conference, Florida State would not get any half-credit rating points by virtue of Georgia Tech's victories, because Daniel 2 considers only the shortest transitive path. The shortest path from Florida State to every ACC team has length 1, the direct victory. So Florida State could never benefit from other ACC teams beating each other. On the other hand, suppose Georgia Tech is also on Notre Dame's schedule, and Notre Dame beats them. Notre Dame will get all the half-credit points from Georgia Tech's victories in the ACC. In fact, because Notre Dame is an independent, many of the teams on their schedule won't play each other.
Thus, almost any win by one of Notre Dame's opponents would result in half a rating point for Notre Dame. By virtue of their schedule, Notre Dame has much more opportunity than Florida State to gain half-points from their opponents' victories.

A third flaw, related to the previous one, is that there are large groups of teams that have relatively few games scheduled with other groups. Division I-A is a good example. NCAA rules specify that, in order for a team to play in a bowl, that team must have six wins over Division I-A opponents (except one I-AA team every four years or something like that). Because of this, I-A teams do not often play I-AA teams, and as a result, Division I-AA teams have a certain shortness-of-path advantage over I-A teams. This caused some I-AA teams to appear very high in the ratings before I decided to rank only I-A. While the nature of the Daniel 2 ratings makes this a major problem, I believe that most other rating systems have the same problem, but to a lesser degree. I would say that the only completely satisfactory solution is to actually account for sparse interscheduling in some fashion. (I believe the Seattle Times does something of this sort, but with conferences.)

Finally, there is a flaw that does not relate much to college football, but cuts into the general credibility of the system. Imagine this rating system applied to Major League Baseball. In most cases, every team would have a rating of zero, because almost every team would have at least one victory over every other team. So this system implicitly depends on teams not often playing each other more than once.

Possible Solutions

Here are some ideas I have for solutions.

It seems that the fatal flaw in the Daniel 2 rating system is that much information is thrown out when the system decides to account only for the shortest transitive path. So, the obvious remedy is to consider all paths. The result of this line of thinking is perhaps a little predictable.
If we consider all paths, the system degenerates into an RPI-like system. That is, it would be some number times your winning percentage, plus some number times your opponents' winning percentage, and so on. It would be slightly different from the RPI because it goes deeper than opponents' opponents, and the numbers would be different.

A second idea is to consider only shortest paths, as before, but to compensate for the hiding effect caused by selecting the shortest path. For example, in the Florida State case with the ACC, we could detect the fact that Florida State can't benefit from other ACC games, and compensate for it. The basic idea is to determine the total number of possible shortest transitive paths a team could take (possible meaning that we assume that the right teams win), and then to give the team rating points based on the percentage of those paths that actually are transitive paths.
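
The hiding effect in the Florida State example can be made concrete with a small sketch. This is not the actual Daniel 2 code. It is a minimal Python illustration under assumptions of my own: a direct win is worth 1 point, each extra step in the shortest transitive path halves the credit, and the comparison between two teams is the credit one way minus the credit the other way. The function names and the toy results (a handful of made-up games, not real scores) are hypothetical.

```python
from collections import deque


def shortest_win_paths(wins, start):
    """Breadth-first search over the 'beat' graph.

    Returns {team: steps} for every team reachable from `start`
    through a chain of victories (the shortest transitive path).
    """
    dist = {start: 0}
    queue = deque([start])
    while queue:
        team = queue.popleft()
        for loser in wins.get(team, ()):
            if loser not in dist:
                dist[loser] = dist[team] + 1
                queue.append(loser)
    del dist[start]
    return dist


def credit(steps):
    # Assumption: 1 point for a direct win, halved for each extra step.
    return 0.5 ** (steps - 1)


def ratings(wins):
    """Sum the pairwise comparisons for every team (assumed form)."""
    teams = set(wins) | {t for losers in wins.values() for t in losers}
    paths = {t: shortest_win_paths(wins, t) for t in teams}
    result = {}
    for a in teams:
        total = 0.0
        for b in teams:
            if a == b:
                continue
            forward = credit(paths[a][b]) if b in paths[a] else 0.0
            backward = credit(paths[b][a]) if a in paths[b] else 0.0
            total += forward - backward
        result[a] = total
    return result


# Toy results: Florida State beats the whole (tiny) conference,
# Georgia Tech beats the rest of it, Notre Dame beats only Georgia Tech.
toy_wins = {
    "Florida State": ["Georgia Tech", "Virginia", "Duke"],
    "Georgia Tech": ["Virginia", "Duke"],
    "Notre Dame": ["Georgia Tech"],
}
toy = ratings(toy_wins)
print(toy["Florida State"], toy["Notre Dame"])

# Two teams that split a series cancel out completely.
split = ratings({"A": ["B"], "B": ["A"]})
print(split)
```

Under these assumptions, Florida State gets exactly one point per win and nothing from Georgia Tech's victories, while Notre Dame's single win over Georgia Tech also brings in half a point for each team Georgia Tech beat. The last two lines show the baseball problem in miniature: once two teams have beaten each other, their comparison contributes nothing.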

One Other Thought

Almost everything written in this section is a result of my intuition about the ratings. With more careful study, I might discover cancellation effects that make the flaws in the system less severe than I thought. Or I might uncover unforeseen problems with my ideas for solutions.