Cluster - DBSCAN


In this note, I want to try an approach that is completely different from my other notes. Instead of writing the Python code myself, I used ChatGPT to create the code I wanted.


NOTE : Refer to this note for my personal experience with ChatGPT coding and the advantages and limitations of the tool. In general, I got a very positive impression of using ChatGPT for coding.


This code was first created by ChatGPT on Feb 03 2023 (meaning ChatGPT 3.5 was used). For this code, I didn't start from a requirement. I asked ChatGPT to write the same function as this using the DBSCAN algorithm.

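import numpy as np
import matplotlib.pyplot as plt
from sklearn.cluster import DBSCAN
from sklearn.datasets import make_moons
from sklearn.preprocessing import StandardScaler


def generate_data():
    # Two interleaving half-moons, standardized to zero mean and unit variance
    X, y = make_moons(n_samples=200, noise=0.05, random_state=0)
    X = StandardScaler().fit_transform(X)
    return X


def dbscan(X, eps=0.3, min_samples=5):
    db = DBSCAN(eps=eps, min_samples=min_samples).fit(X)
    labels = db.labels_
    # Boolean mask that is True for core samples
    core_samples_mask = np.zeros_like(db.labels_, dtype=bool)
    core_samples_mask[db.core_sample_indices_] = True
    return labels, core_samples_mask


def display_result(X, labels, core_samples_mask):
    # Label -1 marks noise, so it is not counted as a cluster
    n_clusters_ = len(set(labels)) - (1 if -1 in labels else 0)
    unique_labels = set(labels)
    # One color per label (assumes the Spectral colormap)
    colors = [plt.cm.Spectral(each) for each in np.linspace(0, 1, len(unique_labels))]
    for k, col in zip(unique_labels, colors):
        if k == -1:
            col = [0, 0, 0, 1]  # black is used for noise

        class_member_mask = (labels == k)
        # Core samples of this cluster: large markers
        xy = X[class_member_mask & core_samples_mask]
        plt.plot(xy[:, 0], xy[:, 1], 'o', markerfacecolor=tuple(col), markeredgecolor='k', markersize=14)

        # Non-core (border or noise) samples: small markers
        xy = X[class_member_mask & ~core_samples_mask]
        plt.plot(xy[:, 0], xy[:, 1], 'o', markerfacecolor=tuple(col), markeredgecolor='k', markersize=6)

    plt.title('Estimated number of clusters: %d' % n_clusters_)
    plt.show()  # needed to display the figure when run as a script


def test_dbscan():
    X = generate_data()
    labels, core_samples_mask = dbscan(X)
    display_result(X, labels, core_samples_mask)


if __name__ == '__main__':
    test_dbscan()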
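
As a side note, the labels and core_samples_mask returned by dbscan() may be easier to read with a rough picture of what DBSCAN does internally. The following is only a toy sketch in plain Python that I put together for illustration. It is not scikit-learn's implementation, it runs in O(n^2) time, and the names simple_dbscan, region_query and the sample points are hypothetical. It assumes that a point counts itself among its neighbors (as scikit-learn's min_samples does) and that -1 marks noise.

```python
import math

NOISE = -1  # same convention as scikit-learn: label -1 marks noise


def region_query(points, i, eps):
    """Return indices of all points within eps of points[i] (including i itself)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def simple_dbscan(points, eps=0.3, min_samples=5):
    """Toy DBSCAN: returns (labels, core_mask) for a list of coordinate tuples."""
    n = len(points)
    neighbors = [region_query(points, i, eps) for i in range(n)]
    # A core point has at least min_samples points (itself included) within eps
    core_mask = [len(nb) >= min_samples for nb in neighbors]
    labels = [NOISE] * n
    cluster_id = 0
    for i in range(n):
        if labels[i] != NOISE or not core_mask[i]:
            continue  # already assigned, or not a core point to grow a cluster from
        labels[i] = cluster_id
        frontier = [i]
        while frontier:
            p = frontier.pop()
            for j in neighbors[p]:
                if labels[j] == NOISE:
                    labels[j] = cluster_id
                    if core_mask[j]:
                        frontier.append(j)  # only core points keep expanding
        cluster_id += 1
    return labels, core_mask


# Two tight groups of four points plus one isolated point
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1),
          (5.0, 5.0), (5.1, 5.0), (5.0, 5.1), (5.1, 5.1),
          (10.0, 0.0)]
labels, core_mask = simple_dbscan(points, eps=0.3, min_samples=3)
print(labels)  # [0, 0, 0, 0, 1, 1, 1, 1, -1]
```

In this small example every point in the two groups is a core point, and the isolated point stays labeled -1, which is the same kind of noise point that display_result() draws in black.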



The result of running test_dbscan() is as follows :