Does Jameda discriminate against non-paying users? Part 2: New Data, New Insights
Motivation
In the first part of this series, we investigated a claim made in an article by the newspaper “Die Zeit”: that the popular German physician rating platform Jameda favors its paying users while discriminating against non-paying physicians. Using a larger and more robust dataset than the original analysis by “Die Zeit”, we confirmed many of their findings. Paying physicians have much higher ratings on average and suspiciously low numbers of poor ratings. However, we disagreed with the article’s conclusion, which didn’t account for alternative explanations for these observations. As the findings are based on correlation alone, we argued that they can’t be taken as proof that Jameda favors its paying members, especially since there is a very intuitive theory for why paying physicians might have better ratings: paying for a premium membership might be related to other positive traits of a physician that lead to better ratings. For example, doctors who value their reputation highly might be more careful in interacting with their patients and also more willing to pay for a profile. Hence, the result of our first analysis was inconclusive.
However, the original article made another credible claim. On Jameda, physicians can report ratings they disagree with. This temporarily removes the review from the site and starts a validation process: the rating’s comment is checked by Jameda and can be removed permanently if it violates certain rules. It could be that premium members report negative ratings more often. This seems intuitive: premium members are probably more engaged in cultivating their profiles and more active in general, whereas some non-paying members might never have looked up their own profile at all, thus missing the opportunity to report any negative reviews. The author from “Die Zeit” asked Jameda whether they remove more negative reviews from premium users’ profiles. Jameda stated that they don’t have any data on this.
Well, no worries, Jameda. I’ve got your back! I’ll gladly offer some data myself to help you out with this question. With a delay of about three years, we’ll finally conclude our analysis. Using new data, we’ll be able to shed light on the original question!
As always, you can download this notebook. Unfortunately, I won’t be able to share the underlying data this time. Sorry for that!
The data
The data was scraped once a day and stored in a SQLite database. It consists of two parts: the first contains the information on the physicians, the second the reviews. (Refer to the first post for more details on the data content.)
We observed changes in the reviews over a period of about nine months, from January 2020 to September 2020. Thus, we were able to identify all reviews that were removed by Jameda during this period because a physician reported them. For some of those, we can also see the result of the validation process: they were either removed for good after being reported or re-published.
First, we create a class for reading the data from our SQLite database:
import datetime
import sqlite3
import logging
import pandas as pd
import numpy as np
import plotly.express as px
import plotly.io as pio
from sqlite3 import Error
from pandas.tseries.frequencies import to_offset
from IPython.display import Markdown as md
pio.renderers.default = "notebook_connected"
px.defaults.template = "plotly_white"
pd.options.display.max_columns = 100
pd.options.display.max_rows = 600
pd.options.display.max_colwidth = 100
np.set_printoptions(threshold=2000)
log = logging.getLogger()
log.setLevel(logging.DEBUG)
# Data for original observation period
DATA_OLD = "../data/raw/2020-09-23_jameda.db"
# To check for longer term changes we got some new data
DATA_NEW = "../data/raw/2021-02-13_jameda.db"
DATE_START_WAVE2 = "2021-02-10"
class DB:
    def __init__(self, db):
        """
        Connect to SQLite DB; expose connection and cursor on the instance
        """
        self.cursor = None
        self.conn = self.create_connection(db)
        self.conn.row_factory = sqlite3.Row  # return column names on fetch
        try:
            self.cursor = self.conn.cursor()
        except Exception as e:
            log.exception(f"Error getting cursor for DB connection: {e}")

    def create_connection(self, db_path):
        """Return a database connection"""
        conn = None  # avoid returning an undefined name if connecting fails
        try:
            conn = sqlite3.connect(db_path)
        except Error:
            log.exception(f"Error connecting to DB: {db_path}")
        return conn

    def send_single_statement(self, statement):
        """Send a single statement to the DB"""
        try:
            self.cursor.execute(statement)
        except Error:
            log.exception(f"Error sending statement: {statement}")
            self.conn.rollback()
            return None
        else:
            log.info(f"OK sending statement: {statement}")
            self.conn.commit()
            return True

    def select_and_fetchall(self, statement):
        """Execute a select statement and return all rows"""
        try:
            self.cursor.execute(statement)
            rows = self.cursor.fetchall()
        except Exception:
            log.exception("Could not select and fetchall")
            return None
        else:
            return rows

    def __del__(self):
        """Make sure we close the connection"""
        if self.conn is not None:
            self.conn.close()
Using the class and its methods, let’s read in the data regarding the doctors’ profiles:
# from sqlite to df
db = DB(DATA_OLD)
ret = db.select_and_fetchall("SELECT * FROM doctors;")
rows = [dict(row) for row in ret]
docs = pd.DataFrame(rows)

docs["premium"] = 0
# If a user has a portrait picture, they are a paying member / premium user
docs.loc[docs["portrait"].notna(), "premium"] = 1

docs_unq = (
    docs.sort_values(["ref_id", "date_update"])
    .groupby("ref_id", as_index=False)
    .agg("last")
)

# Clean doctor subject string (raw regex; regex=True keeps newer pandas happy)
docs["fach_string"] = docs["fach_string"].str.replace(r"[\['\]]", "", regex=True)

docs.head(2)
ref_id | strasse | anrede | gesamt_note | bewertungen | ort | typ | plz | score | art | entfernung | lat | lng | name_nice | name_kurz | typ_string | fach_string | url | url_hinten | snippets | canHaveReviews | portrait | date_create | date_update | errors | premium | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 80283417 | Venloer Str. 389 | 2 | 2.43 | 8 | Köln | HA | 50825 | 51.2639 | 1 | 0.1 | 50.951099 | 6.915398 | Dr. med. Brigitte Jähnig | Dr. Jähnig | Ärztin | Kinderärztin | /koeln/aerzte/kinderaerzte/dr-brigitte-jaehnig/ | 80283417_1/ | [{'type': 'positive', 'label': 'öffentlich gut erreichbar'}, {'type': 'positive', 'label': 'freu... | 1 | None | 2020-01-07 18:16:35 | 2020-01-07 18:16:35 | 0 | 0 |
1 | 80424197 | Venloer Str. 389 | 2 | 1.00 | 1 | Köln | HA | 50825 | 53.6637 | 1 | 0.1 | 50.951099 | 6.915398 | Dr. med. Andrea Steinle | Dr. Steinle | Ärztin | Internistin | /koeln/aerzte/innere-allgemeinmediziner/dr-andrea-steinle/ | 80424197_1/ | None | 1 | None | 2020-01-07 18:16:35 | 2020-01-07 18:16:35 | 0 | 0 |
This is very similar to the data we used in the first post. In this analysis, most of the columns can be ignored. We’ll focus on the `ref_id` (which is the user id) and whether the physician is a paying (`premium = 1`) or non-paying (`premium = 0`) user.
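As a quick sanity check, we can also look at the overall share of premium profiles using the deduplicated `docs_unq` frame built above (a small sketch, not part of the original analysis):

# Share of premium vs. non premium profiles among unique physicians
print(docs_unq["premium"].value_counts(normalize=True))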
Next, we read in and process the reviews:
# from multiple sqlite DBs to single df
DATA = [DATA_OLD, DATA_NEW]
dfs = []
for data in DATA:
    db = DB(data)
    ret = db.select_and_fetchall("SELECT * FROM reviews;")
    rows = [dict(row) for row in ret]
    df = pd.DataFrame(rows).sort_values(["ref_id", "b_id", "date_create"])
    # columns to dates
    df["b_date"] = pd.to_datetime(df["b_date"], unit="s")
    df["date_create"] = pd.to_datetime(df["date_create"])
    df["date_update"] = pd.to_datetime(df["date_update"])
    dfs.append(df)
# DataFrame.append() is deprecated in recent pandas, so we concatenate instead
reviews = pd.concat(dfs, ignore_index=True)

# some processing
# force numeric content to numeric type
num_cols = [
    "ref_id",
    "b_id",
    "b_stand",
    "gesamt_note_class",
    "br_total_votes",
    "br_index",
    "is_archived",
    "kommentar_entfernt",
]
reviews[num_cols] = reviews[num_cols].apply(pd.to_numeric)

# flag wave 1 and 2
reviews["wave"] = np.where(reviews["date_create"] < DATE_START_WAVE2, 1, 2)
reviews_w2 = reviews[reviews["wave"] == 2]
reviews_w1 = reviews[reviews["wave"] == 1]

# skip incomplete days
date_firstday = reviews_w1["date_create"].min().ceil("d")
date_lastday = reviews_w1["date_create"].max().floor("d")
reviews_w1 = reviews_w1.loc[
    (reviews_w1["date_create"] >= date_firstday)
    & (reviews_w1["date_create"] <= date_lastday)
]
The `ref_id` of each review can be related back to the corresponding physician. The `b_id` refers to the unique id of a review. Moreover, we’ll need `gesamt_note`, which is the numerical rating of the review (from 1 = best to 6 = worst), `b_date`, which is the first publication date of a review, `date_create`, which is the date we observed this review, and `b_stand`, which is the status of the review (explained below).
Here, we look strictly at reviews for which we have multiple observations, i.e. reviews that changed during the period we scraped them:
# filter for reviews with multiple entries (new entry only created when reviews changed)
num_entries = reviews_w1.groupby("b_id").size()
multi_entry = reviews_w1.loc[
    reviews_w1["b_id"].isin(num_entries[num_entries > 1].index)
].sort_values(["ref_id", "b_id", "date_create"])

# filter on relevant columns
cols_change = ["ref_id", "b_date", "b_id", "b_stand", "date_create", "wave"]
multi_entry = multi_entry[cols_change].reset_index(drop=True)
Now, we come back to the `b_stand` (status) variable. We know that `1` means the status is normal: this is just a regular review, containing a rating and a comment. When `b_stand` is `4`, it indicates that the review was reported, temporarily removed, and is being verified by Jameda (the process is explained here). A `5` tells us that the review has a comment but no rating. Here’s an example of a reported review:
print(reviews_w1.loc[reviews_w1["b_stand"] == 4].iloc[0][["titel", "kommentar"]])
titel Warum ist diese Bewertung aktuell nicht online?
kommentar Dr. Herrmann hat uns die Bewertung gemeldet, da sie sie für rechtswidrig hält. Aus diesem G...
Name: 55045, dtype: object
These reviews are grayed out on the website. The original text in the title and comment is replaced by a standard message saying something along the lines of: “This review has been reported by the physician and is under review”. Also, the rating is no longer displayed.
Next, for the reviews with multiple entries, we check how `b_stand` changed between observations:
# Compute direction of b_stand change: review removed or re-added (after removal)
multi_entry = multi_entry[multi_entry["b_stand"].isin([1, 4, 5])]
multi_entry["b_stand_prev"] = multi_entry.groupby("b_id")["b_stand"].shift()
multi_entry["change"] = multi_entry["b_stand"] - multi_entry["b_stand_prev"]
multi_entry.loc[multi_entry["change"].isin([3, -1]), "removed"] = 1
multi_entry.loc[multi_entry["change"].isin([-3, 1]), "readded"] = 1
multi_entry.head(4)
ref_id | b_date | b_id | b_stand | date_create | wave | b_stand_prev | change | removed | readded | |
---|---|---|---|---|---|---|---|---|---|---|
0 | 80000939 | 2016-07-14 18:17:01 | 2817015 | 1 | 2020-07-15 03:22:20 | 1 | NaN | NaN | NaN | NaN |
1 | 80000939 | 2016-07-14 18:17:01 | 2817015 | 1 | 2020-08-23 03:03:18 | 1 | 1.0 | 0.0 | NaN | NaN |
2 | 80000941 | 2016-06-10 12:49:17 | 2754134 | 1 | 2020-02-25 03:24:12 | 1 | NaN | NaN | NaN | NaN |
3 | 80000941 | 2016-06-10 12:49:17 | 2754134 | 1 | 2020-09-19 03:02:19 | 1 | 1.0 | 0.0 | NaN | NaN |
Using the change in b_stand
we can conclude whether a review has been removed (it went from 1 to 4 or 5 to 4) or re added after being removed some time before (change from 4 to 1 or 4 to 5).
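To make this encoding concrete, here is a tiny, purely illustrative sanity check with made-up `b_id` and `b_stand` values (hypothetical data, not from the scrape):

# Toy example: change == 3 flags a removal (1 -> 4), change == -3 a re-added review (4 -> 1)
toy = pd.DataFrame({"b_id": [11, 11, 22, 22], "b_stand": [1, 4, 4, 1]})
toy["b_stand_prev"] = toy.groupby("b_id")["b_stand"].shift()
toy["change"] = toy["b_stand"] - toy["b_stand_prev"]
print(toy)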
Let’s store a data frame of only the reviews which have been removed or re-added:
# Only reviews that were removed or re-added
changed = multi_entry[(multi_entry["removed"] == 1) | (multi_entry["readded"] == 1)]
changed = changed[["b_id", "removed", "readded"]].groupby("b_id").max().reset_index()
changed.head(2)
b_id | removed | readded | |
---|---|---|---|
0 | 682847 | 1.0 | NaN |
1 | 807042 | 1.0 | NaN |
# keep only unique reviews and add cols from other tables with infos
reviews_unq = reviews_w1[reviews_w1["b_stand"].isin([1, 4, 5])]
reviews_unq = reviews_unq.drop_duplicates("b_id", keep="last")
reviews_unq = reviews_unq.merge(changed, how="left", on="b_id")

doc_infos = docs[
    ["ref_id", "premium", "ort", "fach_string", "url", "url_hinten"]
].drop_duplicates("ref_id", keep="last")
reviews_unq = reviews_unq.merge(doc_infos, how="left", on="ref_id")

# store the original rating (the grade disappears when a review is removed) and re-add it
ratings = reviews_w1[["gesamt_note_class", "b_id"]].groupby("b_id").min().reset_index()
reviews_unq = reviews_unq.merge(ratings, how="left", on="b_id")
reviews_unq = reviews_unq.replace(np.nan, 0)

# remove those without a grade (under review without previous entry)
# reviews_unq = reviews_unq[reviews_unq["gesamt_note_class_y"] > 0]
reviews_unq.head(2)
ref_id | b_id | b_stand | u_alter | kasse_privat | b_date | gesamt_note | gesamt_note_class_x | gesamt_note_formatted | bs_inhalt | br_total_votes | br_total_value | br_index | is_archived | titel | kommentar_entfernt | kommentar | header | fragen | date_create | date_update | wave | removed | readded | premium | ort | fach_string | url | url_hinten | gesamt_note_class_y | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 80000727 | 3397880 | 1 | 0 | 0 | 2017-05-23 09:48:03 | 1.00 | 1.0 | 1,0 | 0 | 2.0 | 6 | 3.0 | 0 | Sehr gute Kompetenz mit perfekten Netzwerk | 0 | Hier fühlt man sich bestens aufgehoben und wird perfekt betreut in einen sehr schönen Ambiente. | Bewertung vom 23.05.17 | [{'fragekurz': 'Behandlung', 'note': '1'}, {'fragekurz': 'Aufklärung', 'note': '1'}, {'fragekurz... | 2020-09-09 03:24:19 | 2020-09-09 03:24:19 | 1 | 0.0 | 0.0 | 1 | München | Internistin | /muenchen/aerzte/innere-allgemeinmediziner/dr-daniela-grenacher-horn/ | 80000727_1/ | 1.0 |
1 | 80000727 | 3413580 | 1 | 2 | 1 | 2017-06-01 19:11:21 | 1.00 | 1.0 | 1,0 | 0 | 3.0 | 3 | 1.0 | 0 | Eine super Ärztin mit viel Leidenschaft, Menschlichkeit und Beruf als Berufung | 0 | Durch ganz großes Glück kam Ich zu Frau Dr. Grenacher-Horn. <br />\r\n<br />\r\nIch habe noch ni... | Bewertung vom 01.06.17, gesetzlich versichert, 30 bis 50 | [{'fragekurz': 'Behandlung', 'note': '1'}, {'fragekurz': 'Aufklärung', 'note': '1'}, {'fragekurz... | 2020-09-09 03:24:19 | 2020-09-09 03:24:19 | 1 | 0.0 | 0.0 | 1 | München | Internistin | /muenchen/aerzte/innere-allgemeinmediziner/dr-daniela-grenacher-horn/ | 80000727_1/ | 1.0 |
# Only reviews that were removed, with additional columns
removed = reviews_unq.loc[
    (reviews_unq["removed"] == 1),
    [
        "b_id",
        "b_date",
        "date_create",
        "removed",
        "readded",
        "premium",
        "gesamt_note_class_y",
        "wave",
    ],
].reset_index(drop=True)
removed.head(2)
b_id | b_date | date_create | removed | readded | premium | gesamt_note_class_y | wave | |
---|---|---|---|---|---|---|---|---|
0 | 4682967 | 2019-07-12 09:34:26 | 2020-03-02 03:21:15 | 1.0 | 0.0 | 1 | 6.0 | 1 |
1 | 4871698 | 2019-12-02 10:37:17 | 2020-03-02 03:21:15 | 1.0 | 0.0 | 1 | 6.0 | 1 |
Analysis
After cleaning and preparing the data, we can examine some of its properties. First, let’s see how many new reviews were published each week during our observation period:
# Only reviews that were published during observation wave 1
reviews_unq_new_in_wave1 = reviews_unq.loc[
    (reviews_unq["b_date"] >= date_firstday) & (reviews_unq["b_date"] <= date_lastday)
]

# From daily data to weekly aggregation
plt = reviews_unq_new_in_wave1[["b_date", "premium"]].set_index("b_date")
plt["reviews"] = 1
plt["non premium"] = abs(plt["premium"] - 1)
plt = plt.resample("W-MON", label="left").sum()
reviews_new_total = plt["reviews"].sum()

fig = px.bar(
    plt,
    x=plt.index,
    y=["non premium", "premium"],
    title=f"New reviews per week in 2020 (Total {reviews_new_total})",
    labels={"b_date": "Date published", "variable": "Published reviews"},
    barmode="stack",
)
fig.update_xaxes(dtick=7 * 24 * 60 * 60 * 1000, tickformat="%d %b", tick0="2020-01-06")
fig.show()
# Share of Premium
reviews_new_total_prem = plt["premium"].sum()
reviews_new_total_noprem = plt["non premium"].sum()
reviews_new_share_prem = (
    reviews_new_total_prem / (reviews_new_total_prem + reviews_new_total_noprem) * 100
)

# descriptives
reviews_min = plt["reviews"].min()
reviews_max = plt["reviews"].max()
reviews_mean = plt["reviews"].mean()
md(
    f"Each week, there are between {reviews_min} and {reviews_max} newly published "
    f"reviews. The average week sees about {reviews_mean:.0f} of them. Reviews on "
    f"premium profiles have a share of {reviews_new_share_prem:.0f}%."
)
Each week, there are between 191 and 862 newly published reviews. The average week sees about 617 of them. Reviews on premium profiles have a share of 42%.
Next, we compare those numbers to the number of reviews removed during the same time span:
freq_removed = reviews_unq["removed"].value_counts()
freq_removed_perc = reviews_unq["removed"].value_counts(normalize=True) * 100
freq_removed_perc_of_new = freq_removed / reviews_new_total * 100
print(
    f"Removed reviews: {freq_removed[1]:.0f} ({freq_removed_perc[1]:.2f}% of all, "
    f"{freq_removed_perc_of_new[1]:.2f}% of new reviews in period)"
)
Removed reviews: 459 (0.18% of all, 2.01% of new reviews in period)
Over the nine-month period in which we observed all changes to the reviews on a daily basis, we find only 459 removed reviews. That amounts to just 0.18% of all reviews and 2.01% of the reviews newly published during that same observation period.
In general, the removal of reviews does not seem very common. Still, it could have a substantial impact on total ratings: since there are only few negative reviews to begin with, removing them can alter the picture greatly.
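To see why, consider a hypothetical profile (made-up numbers, purely for illustration): removing a single poor rating noticeably improves the average grade.

# Hypothetical profile: nine top ratings (1.0) and one poor rating (6.0)
grades = [1.0] * 9 + [6.0]
print(np.mean(grades))      # 1.5 -- average grade with the poor review
print(np.mean(grades[:9]))  # 1.0 -- average grade after the poor review is removed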
As before, we visualize the removed reviews by week and check for patterns:
# From daily data to weekly aggregation
plt = removed[["date_create", "removed", "premium"]].set_index("date_create")
plt = plt.resample("W-MON", label="left").sum()
plt["non premium"] = plt["removed"] - plt["premium"]
reviews_removed_total = plt["removed"].sum()

fig = px.bar(
    plt,
    x=plt.index,
    y=["non premium", "premium"],
    title=f"Removed reviews per week in 2020 (Total {reviews_removed_total:.0f})",
    labels={"date_create": "Date", "variable": "Removed Reviews"},
    barmode="stack",
)
fig.update_xaxes(dtick=7 * 24 * 60 * 60 * 1000, tickformat="%d %b", tick0="2020-01-06")
fig.show()
# Share of Premium
reviews_removed_prem = plt["premium"].sum()
reviews_removed_prem_share = reviews_removed_prem / reviews_removed_total * 100
# descriptives
reviews_min = int(plt["removed"].min())
reviews_max = int(plt["removed"].max())
reviews_mean = plt["removed"].mean()
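Mirroring the md() pattern from the earlier cell, we can render these descriptives (a sketch of how the summary below was presumably produced):

md(
    f"Each week, between {reviews_min} and {reviews_max} reviews are removed. "
    f"In an average week there are about {reviews_mean:.0f} of them. Out of all "
    f"removed reviews during the observation period, those on premium profiles "
    f"have a share of {reviews_removed_prem_share:.0f}%."
)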
Each week, between 1 and 38 reviews are removed; in an average week, about 12. Out of all reviews removed during the observation period, those on premium profiles have a share of 24%. Hence, the share of removed reviews on premium profiles is substantially lower than their share of published reviews (42%, see above).
So, there is no problem with reviews being disproportionately removed from premium profiles to improve their ratings? Not so fast! While this seems to hold true, it’s not the right question to ask. We’ve learned before that, in general, premium users get far fewer poor ratings than non-paying users (see the first part). Also, it’s intuitive that removed reviews will predominantly have low ratings (we’ll check that in a minute). Consequently, the real question is: “Are relatively more critical reviews (i.e. those with low ratings) removed from premium profiles?” Thus, we also need to take the ratings into account when comparing groups:
# Filter for poor reviews
reviews_poor = reviews_unq[reviews_unq["gesamt_note_class_y"] >= 4]
# How many of the poor ratings are removed by status
share = reviews_poor.groupby(["premium"])["removed"].value_counts(normalize=True) * 100
prob_removed_premium = share[1][1] / share[0][1]
print(share)
premium  removed
0        0.0        98.880847
         1.0         1.119153
1        0.0        96.618357
         1.0         3.381643
Name: removed, dtype: float64
The answer to the above is: yes. On premium profiles, 3.4% of poor reviews are removed, but only 1.1% on non-premium profiles. A poor rating on a premium profile is about three times as likely to be removed as one on a non-premium profile.
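The ratio is already computed above as prob_removed_premium; printing it makes the claim explicit:

# Likelihood ratio of removal for poor reviews: premium vs. non premium profiles
print(f"Removal of a poor review is {prob_removed_premium:.1f}x as likely on premium profiles.")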
In the following, we take a closer look at the reviews that get removed. As stated above, we’d expect them to have strictly negative ratings:
# frequency of removed reviews by rating
plt = removed[["gesamt_note_class_y", "removed", "premium"]].reset_index(drop=True)
plt = (
    plt.value_counts("gesamt_note_class_y", normalize=True)
    .rename("frequency")
    .reset_index()
)

fig = px.bar(
    plt,
    x="gesamt_note_class_y",
    y="frequency",
    title="Rating frequency of removed reviews",
    labels={"gesamt_note_class_y": "Rating"},
)
fig.update_yaxes(tickformat="%")
fig.show()
Well, our intuition was pretty much right: by far most of the removed reviews have a poor rating. Still, a few of the removed reviews had a good rating. It turns out those are mostly misrated cases where a positive rating was given to a critical comment. (Note: a rating of 0 means the review didn’t have a rating at all, i.e. it was just a comment.)
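Out of curiosity, we can peek at those cases (a quick sketch filtering the removed frame for good ratings):

# Removed reviews that nevertheless had a good rating (1 or 2)
good_removed = removed[removed["gesamt_note_class_y"].between(1, 2)]
print(good_removed[["b_id", "gesamt_note_class_y", "premium"]].head())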
Next, we ask ourselves: how long is the typical time span between a critical rating being published and it being reported (more precisely, removed, as we don’t know how much time passes from report to removal)? We look at all reviews that were removed during the observation period and compare the time of removal to the time of creation (which can be well before our observation phase):
# compute duration between review published and removed in days
removed["report_duration"] = removed["date_create"] - removed["b_date"]
removed["report_duration_days"] = removed["report_duration"].map(lambda x: x.days)
removed.head(2)
b_id | b_date | date_create | removed | readded | premium | gesamt_note_class_y | wave | report_duration | report_duration_days | |
---|---|---|---|---|---|---|---|---|---|---|
0 | 4682967 | 2019-07-12 09:34:26 | 2020-03-02 03:21:15 | 1.0 | 0.0 | 1 | 6.0 | 1 | 233 days 17:46:49 | 233 |
1 | 4871698 | 2019-12-02 10:37:17 | 2020-03-02 03:21:15 | 1.0 | 0.0 | 1 | 6.0 | 1 | 90 days 16:43:58 | 90 |
# Visualize time delta from publishing to removal for removed reviews
plt = removed[["report_duration_days", "removed"]].reset_index(drop=True)
fig = px.histogram(
    plt,
    x="report_duration_days",
    title="Removed reviews: days between publishing and removal",
    labels={"report_duration_days": "Days"},
    histnorm="probability",
    nbins=200,
    marginal="box",
)
fig.update_yaxes(tickformat="%")
fig.show()
This chart gives us some nice insights into the whole reporting / removal process:
First, the shortest-lived review was removed only four days after publication. That might be a good proxy for Jameda’s minimum reaction time, i.e. the time between receiving a report and acting on it. About 12% of the removed reviews are removed within 20 days, but typically it takes about three months for a review to be removed. Nonetheless, quite a few reviews are removed much later. In one case, the review was removed after more than seven years!
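These figures can be read off the removed frame directly (a minimal sketch):

# Summary statistics for the time from publication to removal
days = removed["report_duration_days"]
print(f"Fastest removal: {days.min()} days")
print(f"Removed within 20 days: {(days <= 20).mean() * 100:.0f}%")
print(f"Median time to removal: {days.median():.0f} days")
print(f"Longest time to removal: {days.max()} days")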
If we compare this distribution between paying and non-paying physicians, we might learn some more:
# Visualize time delta from publishing to removal, split by premium status
plt = removed[["report_duration_days", "removed", "premium"]].reset_index(drop=True)
plt = plt[plt["removed"] == 1]
fig = px.histogram(
    plt,
    x="report_duration_days",
    color="premium",
    title="Removed reviews: days between publishing and removal",
    labels={"report_duration_days": "Days"},
    histnorm="probability",
    nbins=200,
    marginal="box",
)
fig.update_yaxes(tickformat="%")
fig.show()
Those distributions look quite different and support our earlier hypothesis: premium users do in fact seem to be more concerned about their reputation on Jameda. Critical reviews on their profiles are removed (i.e. reported) much faster: the median is 52 days, while for non-premium users it is 139 days.
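The medians quoted above can be verified with a one-liner on the removed frame:

# Median days from publication to removal, by premium status
print(removed.groupby("premium")["report_duration_days"].median())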
As a last analysis, let’s see what happened to the removed reviews after some time passed. Following our original observation period in 2020 (wave 1), we updated our review data in February 2021 (wave 2). About five months after the end of wave 1, how many of the removed reviews had been re-published? How many were deleted for good?
# Which reviews removed during wave 1 were re added by wave 2?
print("Removed during wave one:", removed["b_id"].shape[0])
readded = removed.merge(reviews_w2, on="b_id")
readded = readded[readded["b_stand"].isin([1, 5])]
print("Out of those, re added until wave two:", readded.shape[0])
pct_readded = readded.shape[0] / removed["b_id"].shape[0] * 100
print(f"Percent re added of removed total: {pct_readded:.1f}%")

# Percent of removed that are re added, by premium status
pct_readded_by_status = (
    readded["premium"].value_counts() / removed["premium"].value_counts() * 100
)
print(f"Percent re added of removed for non premium: {pct_readded_by_status[0]:.1f}%")
print(f"Percent re added of removed for premium: {pct_readded_by_status[1]:.1f}%")
Removed during wave one: 459
Out of those, re added until wave two: 33
Percent re added of removed total: 7.2%
Percent re added of removed for non premium: 7.8%
Percent re added of removed for premium: 5.4%
When a review gets removed, it stays removed in most cases: only 7.2% of all removed reviews were re-published after five months or more in our observation. This share differs by status: for reviews on non-premium profiles, the share of re-added reviews is 7.8%, but for premium profiles it is only 5.4%. This could indicate that Jameda does not validate all reported reviews equally and therefore favors premium users. However, the number of observations is low, and there are a few necessary assumptions that can’t be checked (e.g. that the share of reported reviews actually violating the rules is identical across groups). Hence, this is speculative and can’t be backed by the data. Nonetheless, reporting unpleasant reviews seems like a good strategy for physicians to improve their ratings.
Conclusion
This post was a sequel to the original analysis, which investigated whether Jameda favors its paying users. By collecting and analyzing new data over a period of about nine months, we were able to overcome the limitations of the first analysis. In particular, we focused on the removal of critical reviews from physicians’ profiles. The main takeaways are these:
- Jameda has a lot of active users: each week, hundreds of new reviews are created by patients and published
- About 42% of the newly created reviews are published on the profiles of premium physicians
- Critical reviews can be reported by physicians and are removed until Jameda has validated them. Reviews on premium profiles make up only about 24% of all removed reviews
- Removed reviews overwhelmingly have poor ratings
- Removals are rare. However, they almost exclusively target poor ratings, which are rare themselves. As such, removals can significantly alter a profile
- On premium profiles, reviews with poor ratings are about three times as likely to be removed as those on non-premium profiles
- Critical reviews on premium profiles are removed much faster than those on non-premium profiles
- Once deleted, a rating is very unlikely to ever be re-published: only 7.2% of removed reviews were re-published after more than five months
- There is a weak hint that removed reviews on premium profiles are more likely to stay removed
The last few points are the most relevant ones for answering our original question. Differences in the total ratings of physicians on Jameda are not solely due to received ratings; the removal of reviews plays a role as well: poor reviews are removed faster and more often from premium users’ profiles. This is particularly impactful because a deleted review is very unlikely to ever be re-published. As a result, at least some profiles will have inflated ratings (on average, these are premium profiles). This has serious consequences: not only is the rating a strong signal for potential patients, but Jameda also uses it as the default sorting criterion in search. Consequently, physicians with higher ratings will get more patients through the platform.
While this might seem unfair, it’s not easy to assign blame. It’s likely a consequence of the greater effort that premium users put into maintaining their good reputation on the platform: they are simply more inclined to report critical reviews. Also, it is not too far-fetched to believe that there are good reasons for removing at least some of the reviews. While none of this amounts to favoring premium users, it certainly is a disadvantage for physicians who do not actively monitor their profiles, and those are usually the non-paying users.
Jameda’s main responsibility is to ensure that the reporting of reviews is not abused. This includes making sure that reviews are validated quickly and re-published if they don’t violate any rules. It’s questionable whether this is currently the case: only a small share of removed reviews ever seems to be re-added.
However, one must also give Jameda some credit. It’s a very complicated matter of law to decide which reviews are permissible, and erring on the side of removal might be the safest strategy for them. Also, there have been some efforts to penalize abusive physicians (i.e. those who report reviews on baseless grounds): they can have their quality seal retracted.
The most critical requirement is that Jameda treat all reported reviews equally. If they don’t, that would be a severe case of misconduct. Unfortunately, it might be next to impossible to tell from the outside.
To sum up: it would probably be too simple to judge Jameda harshly for the outcome. Nonetheless, it’s clear that the outcome is suboptimal for at least some (potential) patients and physicians: total ratings for doctors on the platform do not always represent the unfiltered aggregate feedback of patients. Hence, patients’ choices will be biased. On average, premium profiles benefit from this and non-premium profiles are at a disadvantage.