It would be hard to imagine that we want to combine the regression problems and ranking problems together. They’re inherently different. Traditionally, ranking problems only concern the order of ranked items. Even you could have very bad scores for each item, the final order matters.

However, there’re situations we care both the scores for each item, as well as their order. One would argue that eventually this is a regression problem because if you predict 100% accurate on all items and therefore the order should be correct as well. This is true only if you have a perfect regression model. There’re a lot of cases where you have a nearly perfect regression model but the results are terrible in terms of ranking. For instance, you could have a very imbalanced dataset where the majority class has \( y=0 \). Thus, it is possible to achieve a near-perfect regression performance by always predicting \( 0 \), which gives a useless ranking.

D. Sculley provided an easy solution to the problem (published on KDD 2010) above. Basically, we just naively combine two problems together as a single optimization problem, balanced by a parameter as

\[\begin{align}

\min_{\mathbf{w} \in \mathbb{R}^{m}} \alpha L(\mathbf{w}, D) + (1-\alpha)L(\mathbf{w}, P) + \frac{\lambda}{2}||\mathbf{w}||^{2}_{2}

\end{align}\]

where \( \alpha \in [0,1] \). The overall optimization problem is solved through Stochastic Gradient Descent (SGD).

The paper also described efficiently sampling from \( P \) and other scalability issues.