Place the points at (0,1), (1,3), (1,4), (3,5) and (4,6).
As with variance, we can measure the area of the rectangle whose opposite corners are this point and the mean data point.
The area of this rectangle is the contribution to the covariance for this point!
The size of the contribution is the area of the rectangle, but what we are really doing is multiplying the point's x and y deviations from their respective means. These deviations can be negative, so their product can be positive or negative. The area of the rectangle tells us how large the point's contribution to the covariance is, but we still have to decide whether that contribution is positive or negative.
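To see the signed rectangle areas as numbers, here is a minimal sketch that computes each point's contribution for the first set of points. The variable names are only illustrative, and dividing by n - 1 (the sample covariance) is an assumption, since the text does not say which divisor is used.

```python
# First set of points from the exercise.
points = [(0, 1), (1, 3), (1, 4), (3, 5), (4, 6)]
n = len(points)

# The "mean data point" is (mean_x, mean_y).
mean_x = sum(x for x, _ in points) / n
mean_y = sum(y for _, y in points) / n

# Each contribution is the signed area of the rectangle between
# the point and the mean point: (x deviation) * (y deviation).
contributions = [(x - mean_x) * (y - mean_y) for x, y in points]

for (x, y), c in zip(points, contributions):
    sign = "positive" if c > 0 else "negative" if c < 0 else "zero"
    print(f"({x}, {y}): area = {abs(c):.2f}, contribution is {sign}")

# Assumption: sample covariance, so the sum is divided by n - 1.
sample_cov = sum(contributions) / (n - 1)
print(f"covariance = {sample_cov:.2f}")
```

With these points only (1,4) lies left of the mean in x but above it in y, so it is the only point whose contribution is negative; the other four rectangles add to the covariance.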
Move the points around so the sign of the slope of the best-fitting line is the same as you thought above and note the sign of the covariance.
Now place the points at (0,1), (2,2), (4,3), (5,4) and (6,4) and again picture in your mind a line that passes through or near as many points as possible.
Again, move the points to different positions so the sign of the slope of the best-fitting line is the same as you thought above.
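To check your mental picture against numbers, the sketch below computes the covariance and the least-squares slope for the second set of points. It relies on the standard fact that the least-squares slope equals the sum of the deviation products divided by the sum of squared x deviations, so the slope and the covariance always share a sign. The function name `slope_and_covariance` is hypothetical, and the n - 1 divisor is again an assumption.

```python
def slope_and_covariance(points):
    """Return (least-squares slope, sample covariance) for (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n

    # Sum of signed rectangle areas (products of deviations).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    # Sum of squared x deviations; never negative.
    sxx = sum((x - mean_x) ** 2 for x, _ in points)

    slope = sxy / sxx          # least-squares slope of y on x
    cov = sxy / (n - 1)        # assumption: sample covariance
    return slope, cov


# Second set of points from the exercise.
second_set = [(0, 1), (2, 2), (4, 3), (5, 4), (6, 4)]
slope, cov = slope_and_covariance(second_set)
print(f"slope = {slope:.3f}, covariance = {cov:.2f}")
```

Because the denominator of the slope is a sum of squares, only the sign of the summed rectangle areas decides whether the best-fitting line rises or falls. The function needs at least two distinct x values, otherwise the slope is undefined.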