predicting auction prices for diamonds and precious gems using R
price prediction in R
 last updated - 23 February 2022
At the jewelry auction on 30.06.2021, a fine 5.12 ct ruby (lot 2058) with an estimated price range of 180'000 to 280'000 CHF fetched a record price of 680'000 CHF.

Would it be possible to create a machine learning algorithm to predict such an outcome?
Let's take this case as a challenge!

The next Swiss auctions are scheduled for April and June 2022.
We will publish the first prediction as soon as the auction catalog becomes available.
data is always in motion
To build a good prediction model, it is fundamental to focus on the right data. It is not enough to gather some facts; the data also has to be viewed from different perspectives. Diamond prices, for example, fluctuate over time. Prices also depend on several interdependent characteristics and on external diamond grading reports, such as the well-established GIA report.
[Figures: diamond price index, price by carat, GIA grading report]
raw data from past and live auctions
ongoing, 23.02.2022
In Zurich, two renowned auction houses offer a wide range of jewelry. For each auction we evaluate not only the 4Cs of a diamond (carat, color, clarity and cut) but also extend the prediction model with further important input parameters to improve the prediction.

Primary attributes : carat, color, clarity, cut and hammer price
Object attributes  : object-specific attributes, gem trends, evaluation of the origin (mines), ....
Auction attributes : auction house, number of bids, bid duration, number of online visitors
Misc attributes    : non-linear transformations, scaling, time-related variables
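
The misc attributes listed above could be derived along the following lines. This is only a minimal sketch in base R: the data frame `lots`, its column names and all values are made up for illustration, and the log transform is just one common choice of non-linear transformation, not necessarily the one used in this project.

```r
# Hypothetical toy data: column names and values are invented for illustration
# and are not taken from any real auction.
lots <- data.frame(
  carat        = c(1.02, 2.35, 4.10, 0.75),
  hammer_price = c(6500, 31000, 120000, 2800),  # CHF
  auction_date = as.Date(c("2020-06-30", "2020-12-02", "2021-06-30", "2021-12-01"))
)

# Non-linear transformation (assumed choice): natural log of price and carat
lots$log_price <- log(lots$hammer_price)
lots$log_carat <- log(lots$carat)

# Scaling: centre to mean 0 and scale to standard deviation 1
lots$carat_scaled <- as.numeric(scale(lots$carat))

# Time-related variables: auction year and days since the first auction in the data
lots$auction_year     <- as.integer(format(lots$auction_date, "%Y"))
lots$days_since_first <- as.numeric(lots$auction_date - min(lots$auction_date))

print(lots)
```

Keeping the raw columns next to the derived ones makes it easy to compare models fitted on either version.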
exploring the raw data
ongoing, 23.02.2022
With Swiss auction data for diamonds from 2020 and 2021 entered, the box plot analysis shows a balanced distribution and a normal price spread for most carat weights. Only diamonds between 2 and 2.5 carats show higher price volatility, owing to a wider range of quality attributes.
[Figure: box plot of diamond prices by carat weight]
Data as of 23.02.2022

Prices between 2 and 4 carats do not increase significantly.

A high price range appears only between 4 and 4.5 carats.

Surprisingly, prices between 4.5 and 5 carats are low compared to the previous range.
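
A box plot by carat range, like the one discussed above, can be sketched in base R as follows. The data frame `lots` and all its values are invented for illustration, and the half-carat bin width is an assumption suggested by the ranges mentioned in the text.

```r
# Hypothetical toy data (made up for illustration, not real auction results)
lots <- data.frame(
  carat        = c(1.1, 1.4, 1.7, 2.1, 2.3, 2.4, 2.8, 3.2, 3.6, 4.2, 4.4, 4.7),
  hammer_price = c(7000, 9500, 12000, 15000, 42000, 26000,
                   30000, 38000, 41000, 150000, 95000, 60000)  # CHF
)

# Group carat weights into half-carat ranges: [1,1.5), [1.5,2), ..., [4.5,5)
lots$carat_range <- cut(lots$carat, breaks = seq(1, 5, by = 0.5), right = FALSE)

# One box per carat range
boxplot(hammer_price ~ carat_range, data = lots,
        xlab = "Carat range", ylab = "Hammer price (CHF)")

# Numeric counterpart of the box heights: interquartile range per carat range
iqr_by_range <- tapply(lots$hammer_price, lots$carat_range, IQR)
print(iqr_by_range)
```

A large interquartile range in one bin corresponds to the kind of price volatility described for individual carat ranges.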
Let's check the correlation coefficients of the main original variables in a correlation matrix computed with the Pearson method.
[Figure: Pearson correlation matrix of the original variables]
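
A Pearson correlation matrix of this kind can be computed with base R's `cor()`. The sketch below uses invented values. The columns `color_rank` and `clarity_rank` are assumed integer encodings of the ordinal grades (higher means better), because Pearson correlation needs numeric input and the encoding actually used here is not stated.

```r
# Hypothetical toy data (made up for illustration, not real auction results).
# color_rank and clarity_rank are assumed integer encodings of ordinal grades.
lots <- data.frame(
  carat        = c(1.1, 1.7, 2.3, 2.8, 3.6, 4.2),
  color_rank   = c(3, 5, 2, 6, 4, 7),
  clarity_rank = c(4, 3, 6, 2, 5, 7),
  hammer_price = c(7000, 12000, 42000, 30000, 41000, 150000)  # CHF
)

# Pairwise Pearson correlation coefficients between all numeric columns
corr <- cor(lots, method = "pearson")
print(round(corr, 2))
```

The matrix is symmetric with ones on the diagonal. The `hammer_price` row shows how strongly each attribute is linearly related to the price.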