This paper compares the ability of deep learning and entropic learning methods to predict the probability of the Niño3.4 index being above 0.4 (El Niño), below -0.4 (La Niña), or between these two thresholds (neutral) at lead times from 3 to 24 months. In particular, the performance, interpretability, and training cost of entropic learning methods, represented by the entropy-optimal Scalable Probabilistic Approximation (eSPA) algorithm, are compared with those of deep learning methods, represented by a Long Short-Term Memory (LSTM) classifier trained on the same dataset. Because only data derived from observations over the period 1958-2018 and a corresponding surface-forced ocean model are used, the problem manifests as a canonical small-data challenge. Relative to the LSTM model, eSPA exhibits substantially better out-of-sample performance in terms of area under the ROC curve (AUC) for all lead times at ∼0.02% of the computational cost. Comparisons of AUC with other state-of-the-art deep learning models in the literature indicate that eSPA also appears to be more accurate than these models across all three classes. Composite images are generated for each of the cluster centroids from each trained eSPA model at each lead time. At shorter lead times, the composite images for the most significant clusters correspond to patterns representing mature or emerging/declining El Niño or La Niña states, while at longer lead times they correspond to precursor states consisting of extra-tropical anomalies. Finally, modifications to the baseline dataset are explored, showing that the parsimony of the trained eSPA model can be improved without sacrificing predictive power.
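
The three-class target described above can be illustrated with a minimal sketch. It assumes that the La Niña threshold is the negative of the El Niño threshold (-0.4), that values exactly on a threshold count as neutral, and that features at month t are paired with the phase label at month t + lead. The function names, the integer class encoding, and the toy index values are hypothetical and are not taken from the paper's code or data.

```python
import numpy as np

# Assumed class encoding (hypothetical, for illustration only).
LA_NINA, NEUTRAL, EL_NINO = 0, 1, 2


def label_enso_phase(nino34, threshold=0.4):
    """Map Niño3.4 index values to one of three phase labels."""
    nino34 = np.asarray(nino34, dtype=float)
    labels = np.full(nino34.shape, NEUTRAL, dtype=int)
    labels[nino34 > threshold] = EL_NINO    # strictly above +threshold
    labels[nino34 < -threshold] = LA_NINA   # strictly below -threshold
    return labels


def pair_with_lead(features, labels, lead):
    """Pair features at month t with the phase label at month t + lead."""
    if lead <= 0 or lead >= len(labels):
        raise ValueError("lead must be between 1 and len(labels) - 1")
    return features[:-lead], labels[lead:]


# Toy monthly index values (made up, not data from the paper).
index = np.array([0.9, 0.5, 0.1, -0.2, -0.6, -0.8, 0.0, 0.45])
labels = label_enso_phase(index)
X, y = pair_with_lead(index, labels, lead=3)
```

In this sketch a classifier would be fitted to `X` to predict `y`, with one such pairing built for each lead time between 3 and 24 months; in practice `features` would be a matrix of predictors rather than the index itself.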